Preface



These notes were prepared for the first semester of a year-long mathematical
methods course for beginning graduate students in physics. The emphasis is
on linear operators, and on the analogy between such operators acting
on function spaces and matrices acting on finite dimensional spaces. The
operator language then provides a unified framework for investigating ordinary
and partial differential equations, and integral equations.

The mathematical prerequisites for the course are a sound grasp of un- 
dergraduate calculus (including the vector calculus needed for electricity and 
magnetism courses), linear algebra (the more the better), and competence 
at complex arithmetic. Fourier sums and integrals, as well as basic ordinary 
differential equation theory receive a quick review, but it would help if the 
reader had some prior experience to build on. Contour integration is not 
required. 
Chapter 1

Calculus of Variations 



We begin our tour of useful mathematics with what is called the calculus of 
variations. Many physics problems can be formulated in the language of this 
calculus, and once they are, there are useful tools to hand. In the text and
associated exercises we will meet some of the equations whose solution will 
occupy us for much of our journey. 

1.1 What is it good for? 

The classical problems that motivated the creators of the calculus of varia- 
tions include: 

i) Dido's problem: In Virgil's Aeneid we read how Queen Dido of Carthage
must find the largest area that can be enclosed by a curve (a strip of bull's
hide) of fixed length.

ii) Plateau's problem: Find the surface of minimum area for a given set of 
bounding curves. A soap film on a wire frame will adopt this minimal- 
area configuration. 

iii) Johann Bernoulli's Brachistochrone: A bead slides down a curve with
fixed ends. Assuming that the total energy $\frac{1}{2}mv^2 + V(x)$ is constant,
find the curve that gives the most rapid descent.

iv) Catenary: Find the form of a hanging heavy chain of fixed length by 
minimizing its potential energy. 

These problems all involve finding maxima or minima, and hence equating 
some sort of derivative to zero. In the next section we define this derivative, 
and show how to compute it. 
1.2 Functionals

In variational problems we are provided with an expression J[y] that "eats" 
whole functions y(x) and returns a single number. Such objects are called 
functionals to distinguish them from ordinary functions. An ordinary
function is a map $f : \mathbb{R} \to \mathbb{R}$. A functional $J$ is a map
$J : C^\infty(\mathbb{R}) \to \mathbb{R}$, where $C^\infty(\mathbb{R})$ is the
space of smooth (having derivatives of all orders) functions.
To find the function y(x) that maximizes or minimizes a given functional 
J[y] we need to define, and evaluate, its functional derivative. 

1.2.1 The functional derivative 

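We restrict ourselves to expressions of the form

\[ J[y] = \int_{x_1}^{x_2} f(x, y, y', y'', \ldots, y^{(n)})\,dx, \tag{1.1} \]

where $f$ depends on the value of $y(x)$ and only finitely many of its derivatives.
Such functionals are said to be local in $x$.

Consider first a functional $J = \int f\,dx$ in which $f$ depends only on $x$, $y$ and
$y'$. Make a change $y(x) \to y(x) + \varepsilon\eta(x)$, where $\varepsilon$ is a (small)
$x$-independent constant. The resultant change in $J$ is

\[
\begin{aligned}
J[y+\varepsilon\eta] - J[y]
 &= \int_{x_1}^{x_2} \left\{ f(x, y+\varepsilon\eta, y'+\varepsilon\eta') - f(x, y, y') \right\} dx \\
 &= \int_{x_1}^{x_2} \left\{ \varepsilon\eta\,\frac{\partial f}{\partial y}
      + \varepsilon\,\frac{d\eta}{dx}\,\frac{\partial f}{\partial y'} + O(\varepsilon^2) \right\} dx \\
 &= \left[ \varepsilon\eta\,\frac{\partial f}{\partial y'} \right]_{x_1}^{x_2}
      + \int_{x_1}^{x_2} (\varepsilon\eta(x)) \left\{ \frac{\partial f}{\partial y}
      - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right\} dx + O(\varepsilon^2).
\end{aligned}
\]

If $\eta(x_1) = \eta(x_2) = 0$, the variation $\delta y(x) \equiv \varepsilon\eta(x)$ in $y(x)$ is
said to have "fixed endpoints." For such variations the integrated-out part
$[\ldots]_{x_1}^{x_2}$ vanishes. Defining $\delta J$ to be the $O(\varepsilon)$ part of
$J[y+\varepsilon\eta] - J[y]$, we have

\[
\delta J = \int_{x_1}^{x_2} (\varepsilon\eta(x)) \left\{ \frac{\partial f}{\partial y}
   - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right\} dx
 = \int_{x_1}^{x_2} \delta y(x) \left( \frac{\delta J}{\delta y(x)} \right) dx. \tag{1.2}
\]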
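The function

\[ \frac{\delta J}{\delta y(x)} \equiv \frac{\partial f}{\partial y}
   - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \tag{1.3} \]

is called the functional (or Fréchet) derivative of $J$ with respect to $y(x)$. We
can think of it as a generalization of the partial derivative $\partial J/\partial y_i$, where the
discrete subscript "i" on $y$ is replaced by a continuous label "x," and sums
over $i$ are replaced by integrals over $x$:

\[ \delta J = \sum_i \frac{\partial J}{\partial y_i}\,\delta y_i
   \;\longrightarrow\; \int_{x_1}^{x_2} dx\,\frac{\delta J}{\delta y(x)}\,\delta y(x). \tag{1.4} \]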



1.2.2 The Euler-Lagrange equation 

Suppose that we have a differentiable function $J(y_1, y_2, \ldots, y_n)$ of $n$ variables
and seek its stationary points, these being the locations at which $J$ has its
maxima, minima and saddlepoints. At a stationary point $(y_1, y_2, \ldots, y_n)$ the
variation

\[ \delta J = \sum_{i=1}^{n} \frac{\partial J}{\partial y_i}\,\delta y_i \tag{1.5} \]

must be zero for all possible $\delta y_i$. The necessary and sufficient condition for
this is that all partial derivatives $\partial J/\partial y_i$, $i = 1, \ldots, n$, be zero. By analogy,
we expect that a functional $J[y]$ will be stationary under fixed-endpoint
variations $y(x) \to y(x) + \delta y(x)$ when the functional derivative $\delta J/\delta y(x)$ vanishes
for all $x$. In other words, when

\[ \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) = 0,
   \qquad x_1 < x < x_2. \tag{1.6} \]

The condition (1.6) for $y(x)$ to be a stationary point is usually called the
Euler-Lagrange equation.

That $\delta J/\delta y(x) = 0$ is a sufficient condition for $\delta J$ to be zero is clear
from its definition in (1.2). To see that it is a necessary condition we must
appeal to the assumed smoothness of $y(x)$. Consider a function $y(x)$ at which
$J[y]$ is stationary but where $\delta J/\delta y(x)$ is non-zero at some $x_0 \in [x_1, x_2]$.
Because $f(y, y', x)$ is smooth, the functional derivative $\delta J/\delta y(x)$ is also a
smooth function of $x$. Therefore, by continuity, it will have the same sign
throughout some open interval containing $x_0$. By taking $\delta y(x) = \varepsilon\eta(x)$ to be
zero outside this interval, and of one sign within it, we obtain a non-zero $\delta J$,
in contradiction to stationarity. In making this argument, we see why it
was essential to integrate by parts so as to take the derivative off $\delta y$: when
$y$ is fixed at the endpoints, we have $\int \delta y'\,dx = 0$, and so we cannot find a $\delta y'$
that is zero everywhere outside an interval and of one sign within it.

Figure 1.1: Soap film between two rings.
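The reading of $\delta J/\delta y(x)$ as a continuum version of $\partial J/\partial y_i$ can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the illustrative integrand $f = \frac{1}{2}y'^2 + \frac{1}{2}y^2$ (for which (1.3) gives $\delta J/\delta y = y - y''$), discretizes $J$ on a uniform grid of spacing $h$, and compares $(1/h)\,\partial J/\partial y_i$ with the exact functional derivative for the trial function $y = \sin x$. The function names are hypothetical.

```python
import math

def f(x, y, yp):
    # Illustrative integrand chosen for this sketch: f = y'^2/2 + y^2/2,
    # for which (1.3) gives delta J / delta y = y - y''.
    return 0.5 * yp * yp + 0.5 * y * y

def discrete_J(ys, x1, h):
    # Midpoint-rule approximation to J[y] on a uniform grid of spacing h.
    total = 0.0
    for i in range(len(ys) - 1):
        x_mid = x1 + (i + 0.5) * h
        y_mid = 0.5 * (ys[i] + ys[i + 1])
        slope = (ys[i + 1] - ys[i]) / h
        total += h * f(x_mid, y_mid, slope)
    return total

def grid_functional_derivative(ys, x1, h, i, bump=1e-4):
    # (1/h) * dJ/dy_i, with the partial derivative taken by central difference.
    up = list(ys)
    dn = list(ys)
    up[i] += bump
    dn[i] -= bump
    return (discrete_J(up, x1, h) - discrete_J(dn, x1, h)) / (2.0 * bump * h)

n = 200
x1, x2 = 0.0, 1.0
h = (x2 - x1) / n
ys = [math.sin(x1 + j * h) for j in range(n + 1)]
i = n // 2                                   # an interior grid point
numeric = grid_functional_derivative(ys, x1, h, i)
exact = 2.0 * math.sin(x1 + i * h)           # y - y'' for y = sin x
print(numeric, exact)
```

The factor $1/h$ appears because the sum $\sum_i (\partial J/\partial y_i)\,\delta y_i$ becomes $\int dx\,(\delta J/\delta y)\,\delta y$ only after each grid point is weighted by the spacing $h$.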

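When the functional depends on more than one function $y$, then stationarity
under all possible variations requires one equation

\[ \frac{\delta J}{\delta y_i(x)} = \frac{\partial f}{\partial y_i}
   - \frac{d}{dx}\left( \frac{\partial f}{\partial y_i'} \right) = 0 \tag{1.7} \]

for each function $y_i(x)$.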

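If the function $f$ depends on higher derivatives, $y''$, $y^{(3)}$, etc., then we
have to integrate by parts more times, and we end up with

\[ \frac{\delta J}{\delta y(x)} = \frac{\partial f}{\partial y}
   - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right)
   + \frac{d^2}{dx^2}\left( \frac{\partial f}{\partial y''} \right)
   - \frac{d^3}{dx^3}\left( \frac{\partial f}{\partial y^{(3)}} \right) + \cdots. \tag{1.8} \]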

1.2.3 Some applications 

Now we use our new functional derivative to address some of the classic 
problems mentioned in the introduction. 

Example: Soap film supported by a pair of coaxial rings (figure 1.1). This is a
simple case of Plateau's problem. The free energy of the soap film is equal
to twice (once for each liquid-air interface) the surface tension $\sigma$ of the soap
solution times the area of the film. The film can therefore minimize its
free energy by minimizing its area, and the axial symmetry suggests that the
minimal surface will be a surface of revolution about the $x$ axis. We therefore
seek the profile $y(x)$ that makes the area

\[ J[y] = 2\pi \int_{x_1}^{x_2} y\sqrt{1 + y'^2}\,dx \tag{1.9} \]

of the surface of revolution the least among all such surfaces bounded by
the circles of radii $y(x_1) = y_1$ and $y(x_2) = y_2$. Because a minimum is a
stationary point, we seek candidates for the minimizing profile $y(x)$ by setting
the functional derivative $\delta J/\delta y(x)$ to zero.
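We begin by forming the partial derivatives (dropping the overall constant
factor $2\pi$, so that $f = y\sqrt{1+y'^2}$)

\[ \frac{\partial f}{\partial y} = \sqrt{1 + y'^2}, \qquad
   \frac{\partial f}{\partial y'} = \frac{y y'}{\sqrt{1 + y'^2}} \tag{1.10} \]

and use them to write down the Euler-Lagrange equation

\[ \sqrt{1 + y'^2} - \frac{d}{dx}\left( \frac{y y'}{\sqrt{1 + y'^2}} \right) = 0. \tag{1.11} \]

Performing the indicated derivative with respect to $x$ gives

\[ \sqrt{1 + y'^2} - \frac{(y')^2}{\sqrt{1 + y'^2}} - \frac{y y''}{\sqrt{1 + y'^2}}
   + \frac{y (y')^2 y''}{(1 + y'^2)^{3/2}} = 0. \tag{1.12} \]

After collecting terms, this simplifies to

\[ \frac{1}{\sqrt{1 + y'^2}} - \frac{y y''}{(1 + y'^2)^{3/2}} = 0. \tag{1.13} \]

The differential equation (1.13) still looks a trifle intimidating. To simplify
further, we multiply by $y'$ to get

\[ 0 = \frac{y'}{\sqrt{1 + y'^2}} - \frac{y y' y''}{(1 + y'^2)^{3/2}}
     = \frac{d}{dx}\left( \frac{y}{\sqrt{1 + y'^2}} \right). \tag{1.14} \]

The solution to the minimization problem therefore reduces to solving

\[ \frac{y}{\sqrt{1 + y'^2}} = \kappa, \tag{1.15} \]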
where $\kappa$ is an as yet undetermined integration constant. Fortunately this
non-linear, first order, differential equation is elementary. We recast it as

\[ \frac{dy}{dx} = \sqrt{\frac{y^2}{\kappa^2} - 1} \tag{1.16} \]

and separate variables

\[ x = \int \frac{dy}{\sqrt{\dfrac{y^2}{\kappa^2} - 1}}. \tag{1.17} \]

We now make the natural substitution $y = \kappa\cosh t$, whence

\[ x = \kappa \int dt. \tag{1.18} \]

Thus we find that $x + a = \kappa t$, leading to

\[ y = \kappa \cosh\frac{x + a}{\kappa}. \tag{1.19} \]

We select the constants $\kappa$ and $a$ to fit the endpoints $y(x_1) = y_1$ and $y(x_2) = y_2$.

Figure 1.2: Hanging chain (pulleys at height $h$, at $x = -L$ and $x = +L$).

Example: Heavy Chain over Pulleys. We cannot yet consider the form of the
catenary, a hanging chain of fixed length, but we can solve a simpler problem
of a heavy flexible cable draped over a pair of pulleys located at $x = \pm L$,
$y = h$, and with the excess cable resting on a horizontal surface as illustrated
in figure 1.2.
Figure 1.3: Intersection of $y = ht/L$ with $y = \cosh t$.



The potential energy of the system is

\[ \text{P.E.} = \sum mgy = \rho g \int_{-L}^{L} y\sqrt{1 + (y')^2}\,dx + \text{const.} \tag{1.20} \]

Here the constant refers to the unchanging potential energy

\[ 2 \times \int_0^h mgy\,dy = mgh^2 \tag{1.21} \]

of the vertically hanging cable. The potential energy of the cable lying on the
horizontal surface is zero because $y$ is zero there. Notice that the tension in
the suspended cable is being tacitly determined by the weight of the vertical
segments.

The Euler-Lagrange equations coincide with those of the soap film, so

\[ y = \kappa \cosh\frac{x + a}{\kappa}, \tag{1.22} \]

where we have to find $\kappa$ and $a$. We have

\[ h = \kappa \cosh\frac{-L + a}{\kappa} = \kappa \cosh\frac{L + a}{\kappa}, \tag{1.23} \]
so $a = 0$ and $h = \kappa\cosh(L/\kappa)$. Setting $t = L/\kappa$ this reduces to

\[ \left( \frac{h}{L} \right) t = \cosh t. \tag{1.24} \]

By considering the intersection of the line $y = ht/L$ with $y = \cosh t$ (figure
1.3) we see that if $h/L$ is too small there is no solution (the weight of the
suspended cable is too big for the tension supplied by the dangling ends)
and once $h/L$ is large enough there will be two possible solutions. Further
investigation will show that the solution with the larger value of $\kappa$ is a point
of stable equilibrium, while the solution with the smaller $\kappa$ is unstable.

Figure 1.4: Bead on a wire.
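The graphical argument for (1.24) can also be sketched numerically. The code below is an illustration under stated assumptions rather than part of the original notes: the helper names `bisect` and `cable_solutions` are hypothetical, and the method (locate the minimum of $g(t) = \cosh t - (h/L)t$ at $\sinh t = h/L$, then bisect on either side of it) is one simple choice among many.

```python
import math

def bisect(func, lo, hi, tol=1e-12):
    # Plain bisection; assumes func changes sign (or vanishes) on [lo, hi].
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def cable_solutions(h, L):
    """Return the values of kappa = L/t solving (h/L) t = cosh t, largest first."""
    slope = h / L

    def g(t):
        return math.cosh(t) - slope * t

    t_min = math.asinh(slope)        # g'(t) = sinh t - slope vanishes here
    if g(t_min) > 0:
        return []                    # the line misses cosh t: no equilibrium
    t_hi = t_min + 1.0
    while g(t_hi) <= 0:              # march right until g is positive again
        t_hi += 1.0
    roots = [bisect(g, 0.0, t_min), bisect(g, t_min, t_hi)]
    return [L / t for t in roots]

print(cable_solutions(2.0, 1.0))     # two equilibria when h/L is large enough
print(cable_solutions(1.0, 1.0))     # none when h/L is too small
```

The smaller root $t$ corresponds to the larger $\kappa$, which the text identifies as the stable configuration.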
Example: The Brachistochrone. This problem was posed as a challenge by
Johann Bernoulli in 1696. He asked what shape a wire with endpoints
(0, 0) and (a, b) should take in order that a frictionless bead will slide from rest down
the wire in the shortest possible time (figure 1.4). The problem's name comes
from Greek: βράχιστος means shortest and χρόνος means time.

When presented with an ostensibly anonymous solution, Johann made his
famous remark: "Tanquam ex ungue leonem" (I recognize the lion by his
clawmark), meaning that he recognized that the author was Isaac Newton.

Johann gave a solution himself, but that of his brother Jacob Bernoulli 
was superior and Johann tried to pass it off as his. This was not atypical. 
Johann later misrepresented the publication date of his book on hydraulics 
to make it seem that he had priority in this field over his own son, Daniel 
Bernoulli. 
Figure 1.5: A wheel rolls on the $x$ axis. The dot, which is fixed to the rim of
the wheel, traces out a cycloid.



We begin our solution of the problem by observing that the total energy

\[ E = \frac{1}{2}m(\dot x^2 + \dot y^2) - mgy = \frac{1}{2}m\dot x^2(1 + y'^2) - mgy, \tag{1.25} \]

of the bead is constant. From the initial condition we see that this constant
is zero. We therefore wish to minimize

\[ T = \int_0^T dt = \int_0^a \frac{1}{\dot x}\,dx = \int_0^a \sqrt{\frac{1 + y'^2}{2gy}}\,dx \tag{1.26} \]

so as to find $y(x)$, given that $y(0) = 0$ and $y(a) = b$. The Euler-Lagrange
equation is

\[ y y'' + \frac{1}{2}(1 + y'^2) = 0. \tag{1.27} \]

Again this looks intimidating, but we can use the same trick of multiplying
through by $y'$ to get

\[ y'\left( y y'' + \frac{1}{2}(1 + y'^2) \right) = \frac{1}{2}\frac{d}{dx}\left\{ y(1 + y'^2) \right\} = 0. \tag{1.28} \]

Thus

\[ 2c = y(1 + y'^2). \tag{1.29} \]

This differential equation has a parametric solution

\[ x = c(\theta - \sin\theta), \qquad y = c(1 - \cos\theta), \tag{1.30} \]

(as you should verify) and the solution is the cycloid shown in figure 1.5.
The parameter $c$ is determined by requiring that the curve does in fact pass
through the point (a, b).
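The two closing claims, that (1.30) satisfies (1.29) and that $c$ is fixed by the endpoint, can be sketched in a few lines. This is an illustrative check, not the notes' own method: the names `fit_cycloid` and `first_integral` are hypothetical, and it assumes $a > 0$, $b > 0$ and a single arch ($0 < \theta < 2\pi$), on which $x/y = (\theta - \sin\theta)/(1 - \cos\theta)$ increases monotonically so that bisection applies.

```python
import math

def fit_cycloid(a, b):
    """Find (c, theta_end) so that the cycloid (1.30) passes through (a, b)."""
    target = a / b

    def ratio(th):
        # x/y along the cycloid; independent of c.
        return (th - math.sin(th)) / (1.0 - math.cos(th))

    lo, hi = 1e-3, 2.0 * math.pi - 1e-3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    theta_end = 0.5 * (lo + hi)
    c = b / (1.0 - math.cos(theta_end))
    return c, theta_end

def first_integral(c, theta):
    # y (1 + y'^2) evaluated on the cycloid, using y' = (dy/dtheta)/(dx/dtheta).
    y = c * (1.0 - math.cos(theta))
    yprime = math.sin(theta) / (1.0 - math.cos(theta))
    return y * (1.0 + yprime * yprime)

c, theta_end = fit_cycloid(math.pi, 2.0)
print(c, theta_end)
print([first_integral(c, th) for th in (0.5, 1.0, 2.0, 3.0)])
```

For the endpoint $(a, b) = (\pi, 2)$ the bead arrives exactly at the bottom of the arch, $\theta = \pi$, and $c = 1$.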



1.2.4 A first integral

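How did we know that we could simplify both the soap-film problem and
the brachistochrone by multiplying the Euler equation by $y'$? The answer
is that there is a general principle, closely related to energy conservation in
mechanics, that tells us when and how we can make such a simplification.
The $y'$ trick works when the $f$ in $\int f\,dx$ is of the form $f(y, y')$, i.e. has no
explicit dependence on $x$. In this case the last term in

\[ \frac{df}{dx} = y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'} + \frac{\partial f}{\partial x} \tag{1.31} \]

is absent. We then have

\[
\begin{aligned}
\frac{d}{dx}\left( f - y'\frac{\partial f}{\partial y'} \right)
 &= y'\frac{\partial f}{\partial y} + y''\frac{\partial f}{\partial y'}
   - y''\frac{\partial f}{\partial y'} - y'\frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \\
 &= y'\left( \frac{\partial f}{\partial y} - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right),
\end{aligned} \tag{1.32}
\]

and this is zero if the Euler-Lagrange equation is satisfied.
The quantity

\[ I = f - y'\frac{\partial f}{\partial y'} \tag{1.33} \]

is called a first integral of the Euler-Lagrange equation. In the soap-film case

\[ f - y'\frac{\partial f}{\partial y'} = y\sqrt{1 + (y')^2} - \frac{y (y')^2}{\sqrt{1 + (y')^2}}
   = \frac{y}{\sqrt{1 + (y')^2}}. \tag{1.34} \]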

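When there are a number of dependent variables $y_i$, so that we have

\[ J[y_1, y_2, \ldots, y_n] = \int f(y_1, y_2, \ldots, y_n;\, y_1', y_2', \ldots, y_n')\,dx, \tag{1.35} \]

then the first integral becomes

\[ I = f - \sum_i y_i'\frac{\partial f}{\partial y_i'}. \tag{1.36} \]

Again

\[ \frac{dI}{dx} = \frac{d}{dx}\left( f - \sum_i y_i'\frac{\partial f}{\partial y_i'} \right)
   = \sum_i y_i'\left( \frac{\partial f}{\partial y_i} - \frac{d}{dx}\left( \frac{\partial f}{\partial y_i'} \right) \right), \tag{1.37} \]

and this is zero if the Euler-Lagrange equation is satisfied for each $y_i$.

Note that there is only one first integral, no matter how many $y_i$'s there
are.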

1.3 Lagrangian Mechanics 

In his Mécanique Analytique (1788) Joseph-Louis de La Grange, following
Jean d'Alembert (1742) and Pierre de Maupertuis (1744), showed that most
of classical mechanics can be recast as a variational condition: the principle
of least action. The idea is to introduce the Lagrangian function $L = T - V$,
where $T$ is the kinetic energy of the system and $V$ the potential energy, both
expressed in terms of generalized co-ordinates $q^i$ and their time derivatives
$\dot q^i$. Then, Lagrange showed, the multitude of Newton's $F = ma$ equations,
one for each particle in the system, can be reduced to

\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q^i} \right) - \frac{\partial L}{\partial q^i} = 0, \tag{1.38} \]

one equation for each generalized coordinate $q$. Quite remarkably, given
that Lagrange's derivation contains no mention of maxima or minima, we
recognise that this is precisely the condition that the action functional

\[ S[q] = \int_{t_{\rm initial}}^{t_{\rm final}} L(t, q^i; \dot q^i)\,dt \tag{1.39} \]

be stationary with respect to variations of the trajectory $q^i(t)$ that leave the
initial and final points fixed. This fact so impressed its discoverers that they
believed they had uncovered the unifying principle of the universe. Maupertuis,
for one, tried to base a proof of the existence of God on it. Today the
action integral, through its starring role in the Feynman path-integral
formulation of quantum mechanics, remains at the heart of theoretical physics.
Figure 1.6: Atwood's machine.



1.3.1 One degree of freedom 

We shall not attempt to derive Lagrange's equations from d'Alembert's ex- 
tension of the principle of virtual work - leaving this task to a mechanics 
course — but instead satisfy ourselves with some examples which illustrate 
the computational advantages of Lagrange's approach, as well as a subtle 
pitfall. 

Consider, for example, Atwood's Machine (figure 1.6). This device,
invented in 1784 but still a familiar sight in teaching laboratories, is used to
demonstrate Newton's laws of motion and to measure $g$. It consists of two
weights connected by a light string of length $l$ which passes over a light and
frictionless pulley.

The elementary approach is to write an equation of motion for each of
the two weights

\[ m_1\ddot x_1 = m_1 g - T, \qquad m_2\ddot x_2 = m_2 g - T. \tag{1.40} \]

We then take into account the constraint $\dot x_1 = -\dot x_2$ and eliminate $\ddot x_2$ in favour
of $\ddot x_1$:

\[ m_1\ddot x_1 = m_1 g - T, \qquad -m_2\ddot x_1 = m_2 g - T. \tag{1.41} \]
Finally we eliminate the constraint force, the tension $T$, and obtain the
acceleration

\[ (m_1 + m_2)\ddot x_1 = (m_1 - m_2) g. \tag{1.42} \]

Lagrange's solution takes the constraint into account from the very
beginning by introducing a single generalized coordinate $q = x_1 = l - x_2$ and
writing

\[ L = T - V = \frac{1}{2}(m_1 + m_2)\dot q^2 - (m_2 - m_1) g q. \tag{1.43} \]

From this we obtain a single equation of motion

\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) - \frac{\partial L}{\partial q} = 0
   \quad\Rightarrow\quad (m_1 + m_2)\ddot q = (m_1 - m_2) g. \tag{1.44} \]

The advantage of the Lagrangian method is that constraint forces, which
do no net work, never appear. The disadvantage is exactly the same: if we
need to find the constraint forces (in this case the tension in the string)
we cannot use Lagrange alone.

Lagrange provides a convenient way to derive the equations of motion in 
non-cartesian co-ordinate systems, such as plane polar co-ordinates. 




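Consider the central force problem with $F_r = -\partial_r V(r)$. Newton's method
begins by computing the acceleration in polar coordinates. This is most
easily done by setting $z = re^{i\theta}$ and differentiating twice:

\[
\begin{aligned}
\dot z &= (\dot r + i r\dot\theta)e^{i\theta}, \\
\ddot z &= (\ddot r - r\dot\theta^2)e^{i\theta} + i(2\dot r\dot\theta + r\ddot\theta)e^{i\theta}.
\end{aligned} \tag{1.45}
\]

Reading off the components parallel and perpendicular to $e^{i\theta}$ gives the radial
and angular acceleration

\[ a_r = \ddot r - r\dot\theta^2, \qquad a_\theta = r\ddot\theta + 2\dot r\dot\theta. \tag{1.46} \]

Newton's equations therefore become

\[
\begin{aligned}
m(\ddot r - r\dot\theta^2) &= -\frac{\partial V}{\partial r}, \\
m(r\ddot\theta + 2\dot r\dot\theta) &= 0, \quad\text{or, equivalently,}\quad \frac{d}{dt}(mr^2\dot\theta) = 0.
\end{aligned} \tag{1.47}
\]

Setting $l = mr^2\dot\theta$, the conserved angular momentum, and eliminating $\dot\theta$ gives

\[ m\ddot r - \frac{l^2}{mr^3} = -\frac{\partial V}{\partial r}. \tag{1.48} \]

(If this were Kepler's problem, where $V = -GmM/r$, we would now proceed
to simplify this equation by substituting $r = 1/u$, but that is another story.)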

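Following Lagrange we first compute the kinetic energy in polar coordinates
(this requires less thought than computing the acceleration) and set

\[ L = T - V = \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) - V(r). \tag{1.49} \]

The Euler-Lagrange equations are now

\[
\begin{aligned}
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot r} \right) - \frac{\partial L}{\partial r} &= 0
   \quad\Rightarrow\quad m\ddot r - mr\dot\theta^2 + \frac{\partial V}{\partial r} = 0, \\
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot\theta} \right) - \frac{\partial L}{\partial \theta} &= 0
   \quad\Rightarrow\quad \frac{d}{dt}(mr^2\dot\theta) = 0,
\end{aligned} \tag{1.50}
\]

and coincide with Newton's.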
The first integral is

\[
\begin{aligned}
E &= \dot r\frac{\partial L}{\partial \dot r} + \dot\theta\frac{\partial L}{\partial \dot\theta} - L \\
  &= \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) + V(r),
\end{aligned} \tag{1.51}
\]

which is the total energy. Thus the constancy of the first integral states that

\[ \frac{dE}{dt} = 0, \tag{1.52} \]

or that energy is conserved.
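Both conservation laws can be observed in a direct numerical integration of the equations of motion (1.50). The sketch below is illustrative only: it assumes a harmonic potential $V = \frac{1}{2}kr^2$ with $m = k = 1$, uses a hand-written fourth-order Runge-Kutta step, and all function names are hypothetical.

```python
M = 1.0   # mass (assumed value for this sketch)
K = 1.0   # spring constant in the assumed potential V = K r^2 / 2

def V(r):
    return 0.5 * K * r * r

def dV_dr(r):
    return K * r

def rhs(state):
    # state = (r, rdot, theta, thetadot); accelerations follow from (1.50).
    r, rdot, theta, thetadot = state
    rddot = r * thetadot ** 2 - dV_dr(r) / M
    thetaddot = -2.0 * rdot * thetadot / r
    return (rdot, rddot, thetadot, thetaddot)

def rk4_step(state, dt):
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state):
    r, rdot, _, thetadot = state
    return 0.5 * M * (rdot ** 2 + r ** 2 * thetadot ** 2) + V(r)

def angular_momentum(state):
    r, _, _, thetadot = state
    return M * r ** 2 * thetadot

state = (1.0, 0.0, 0.0, 0.5)            # r, rdot, theta, thetadot at t = 0
E0, l0 = energy(state), angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
print(energy(state) - E0, angular_momentum(state) - l0)
```

The printed drifts are limited only by the integrator's truncation and rounding errors; neither quantity is imposed as a constraint in the code.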

Warning: We might realize, without having gone to the trouble of deriving
it from the Lagrange equations, that rotational invariance guarantees that
the angular momentum $l = mr^2\dot\theta$ is constant. Having done so, it is almost
irresistible to try to short-circuit some of the labour by plugging this prior
knowledge into

\[ L = \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) - V(r) \tag{1.53} \]

so as to eliminate the variable $\dot\theta$ in favour of the constant $l$. If we try this we
get

\[ L \stackrel{?}{\to} \frac{1}{2}m\dot r^2 + \frac{l^2}{2mr^2} - V(r). \tag{1.54} \]

We can now directly write down the Lagrange equation for $r$, which is

\[ m\ddot r + \frac{l^2}{mr^3} \stackrel{?}{=} -\frac{\partial V}{\partial r}. \tag{1.55} \]

Unfortunately this has the wrong sign before the $l^2/mr^3$ term! The lesson is
that we must be very careful in using consequences of a variational principle
to modify the principle. It can be done, and in mechanics it leads to the
Routhian or, in more modern language, to Hamiltonian reduction, but it
requires using a Legendre transform. The reader should consult a book on
mechanics for details.
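As a brief sketch of how the Legendre transform repairs the sign (using the convention $R = L - l\dot\theta$, which is an assumption here; textbooks differ on the overall sign of the Routhian), eliminate $\dot\theta = l/mr^2$ only after subtracting $l\dot\theta$:

\[
R(r, \dot r; l) = \left[ L - l\dot\theta \right]_{\dot\theta = l/mr^2}
 = \frac{1}{2}m\dot r^2 + \frac{l^2}{2mr^2} - V(r) - \frac{l^2}{mr^2}
 = \frac{1}{2}m\dot r^2 - \frac{l^2}{2mr^2} - V(r).
\]

Treating $R$ as a Lagrangian for $r$ alone then gives

\[
\frac{d}{dt}\left( \frac{\partial R}{\partial \dot r} \right) - \frac{\partial R}{\partial r}
 = m\ddot r - \frac{l^2}{mr^3} + \frac{\partial V}{\partial r} = 0,
\]

which agrees with (1.48).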



1.3.2 Noether's theorem 

The time-independence of the first integral

\[ \frac{d}{dt}\left\{ \dot q\frac{\partial L}{\partial \dot q} - L \right\} = 0, \tag{1.56} \]

and of angular momentum

\[ \frac{d}{dt}\{ mr^2\dot\theta \} = 0, \tag{1.57} \]

are examples of conservation laws. We obtained them both by manipulating
the Euler-Lagrange equations of motion, but also indicated that they were
in some way connected with symmetries. One of the chief advantages of a
variational formulation of a physical problem is that this connection

\[ \text{Symmetry} \Leftrightarrow \text{Conservation Law} \]

can be made explicit by exploiting a strategy due to Emmy Noether. She
showed how to proceed directly from the action integral to the conserved
quantity without having to fiddle about with the individual equations of
motion. We begin by illustrating her technique in the case of angular
momentum, whose conservation is a consequence of the rotational symmetry of
the central force problem. The action integral for the central force problem
is

\[ S = \int_0^T \left\{ \frac{1}{2}m(\dot r^2 + r^2\dot\theta^2) - V(r) \right\} dt. \tag{1.58} \]

Noether observes that the integrand is left unchanged if we make the variation

\[ \theta(t) \to \theta(t) + \varepsilon\alpha \tag{1.59} \]

where $\alpha$ is a fixed angle and $\varepsilon$ is a small, time-independent, parameter. This
invariance is the symmetry we shall exploit. It is a mathematical identity:
it does not require that $r$ and $\theta$ obey the equations of motion. She next
observes that since the equations of motion are equivalent to the statement
that $S$ is left stationary under any infinitesimal variations in $r$ and $\theta$, they
necessarily imply that $S$ is stationary under the specific variation

\[ \theta(t) \to \theta(t) + \varepsilon(t)\alpha \tag{1.60} \]

where now $\varepsilon$ is allowed to be time-dependent. This stationarity of the action
is no longer a mathematical identity, but, because it requires $r$, $\theta$, to obey
the equations of motion, has physical content. Inserting $\delta\theta = \varepsilon(t)\alpha$ into our
expression for $S$ gives

\[ \delta S = \alpha \int_0^T \left\{ mr^2\dot\theta \right\} \dot\varepsilon\,dt. \tag{1.61} \]
Note that this variation depends only on the time derivative of $\varepsilon$, and not $\varepsilon$
itself. This is because of the invariance of $S$ under time-independent
rotations. We now assume that $\varepsilon(t) = 0$ at $t = 0$ and $t = T$, and integrate by
parts to take the time derivative off $\varepsilon$ and put it on the rest of the integrand:

\[ \delta S = -\alpha \int_0^T \left\{ \frac{d}{dt}(mr^2\dot\theta) \right\} \varepsilon(t)\,dt. \tag{1.62} \]

Since the equations of motion say that $\delta S = 0$ under all infinitesimal
variations, and in particular those due to any time dependent rotation $\varepsilon(t)\alpha$, we
deduce that the equations of motion imply that the coefficient of $\varepsilon(t)$ must
be zero, and so, provided $r(t)$, $\theta(t)$, obey the equations of motion, we have

\[ 0 = \frac{d}{dt}(mr^2\dot\theta). \tag{1.63} \]

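As a second illustration we derive energy (first integral) conservation for
the case that the system is invariant under time translations, meaning
that $L$ does not depend explicitly on time. In this case the action integral
is invariant under constant time shifts $t \to t + \varepsilon$ in the argument of the
dynamical variable:

\[ q(t) \to q(t + \varepsilon) \approx q(t) + \varepsilon\dot q. \tag{1.64} \]

The equations of motion tell us that the action will be stationary under
the variation

\[ \delta q(t) = \varepsilon(t)\dot q, \tag{1.65} \]

where again we now permit the parameter $\varepsilon$ to depend on $t$. We insert this
variation into

\[ S = \int_0^T L\,dt \tag{1.66} \]

and find

\[ \delta S = \int_0^T \left\{ \frac{\partial L}{\partial q}\,\varepsilon\dot q
   + \frac{\partial L}{\partial \dot q}\,(\dot\varepsilon\dot q + \varepsilon\ddot q) \right\} dt. \tag{1.67} \]

This expression contains undotted $\varepsilon$'s. Because of this the change in $S$ is not
obviously zero when $\varepsilon$ is time independent, but the absence of any explicit
$t$ dependence in $L$ tells us that

\[ \frac{dL}{dt} = \frac{\partial L}{\partial q}\,\dot q + \frac{\partial L}{\partial \dot q}\,\ddot q. \tag{1.68} \]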
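As a consequence, for time independent $\varepsilon$, we have

\[ \delta S = \int_0^T \left\{ \varepsilon\frac{dL}{dt} \right\} dt = \varepsilon\,[L]_0^T, \tag{1.69} \]

showing that the change in $S$ comes entirely from the endpoints of the time
interval. These fixed endpoints explicitly break time-translation invariance,
but in a trivial manner. For general $\varepsilon(t)$ we have

\[ \delta S = \int_0^T \left\{ \varepsilon(t)\frac{dL}{dt} + \frac{\partial L}{\partial \dot q}\,\dot q\,\dot\varepsilon \right\} dt. \tag{1.70} \]

This equation is an identity. It does not rely on $q$ obeying the equation of
motion. After an integration by parts, taking $\varepsilon(t)$ to be zero at $t = 0, T$, it
is equivalent to

\[ \delta S = \int_0^T \varepsilon(t)\,\frac{d}{dt}\left\{ L - \frac{\partial L}{\partial \dot q}\,\dot q \right\} dt. \tag{1.71} \]

Now we assume that $q(t)$ does obey the equations of motion. The variation
principle then says that $\delta S = 0$ for any $\varepsilon(t)$, and we deduce that for $q(t)$
satisfying the equations of motion we have

\[ \frac{d}{dt}\left\{ L - \frac{\partial L}{\partial \dot q}\,\dot q \right\} = 0. \tag{1.72} \]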
The general strategy that constitutes "Noether's theorem" must now be 
obvious: we look for an invariance of the action under a symmetry trans- 
formation with a time-independent parameter. We then observe that if the 
dynamical variables obey the equations of motion, then the action principle 
tells us that the action will remain stationary under such a variation of the 
dynamical variables even after the parameter is promoted to being time de- 
pendent. The resultant variation of S can only depend on time derivatives of 
the parameter. We integrate by parts so as to take all the time derivatives off 
it, and on to the rest of the integrand. Because the parameter is arbitrary, 
we deduce that the equations of motion tell us that its coefficient in the
integral must be zero. This coefficient is the time derivative of something, so 
this something is conserved. 

1.3.3 Many degrees of freedom 

The extension of the action principle to many degrees of freedom is
straightforward. As an example consider the small oscillations about equilibrium of
a system with $N$ degrees of freedom. We parametrize the system in terms of
deviations from the equilibrium position and expand out to quadratic order.
We obtain a Lagrangian

\[ L = \sum_{i,j=1}^{N} \left\{ \frac{1}{2}M_{ij}\dot q^i\dot q^j - \frac{1}{2}V_{ij}q^i q^j \right\}, \tag{1.73} \]

where $M_{ij}$ and $V_{ij}$ are $N \times N$ symmetric matrices encoding the inertial and
potential energy properties of the system. Now we have one equation

\[ 0 = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q^i} \right) - \frac{\partial L}{\partial q^i}
     = \sum_{j=1}^{N} \left( M_{ij}\ddot q^j + V_{ij}q^j \right) \tag{1.74} \]

for each $i$.
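Substituting a trial solution $q^j(t) = a^j\cos\omega t$ into (1.74) gives $\sum_j (V_{ij} - \omega^2 M_{ij})a^j = 0$, so the allowed $\omega^2$ are the roots of $\det(V - \omega^2 M) = 0$. The following minimal sketch works this out for $N = 2$, where the determinant is a quadratic in $\omega^2$. The function name is hypothetical, and the example matrices (two unit masses joined to each other and to walls by three unit springs) are an assumption chosen for illustration; the sketch also assumes a stable equilibrium so that both roots are positive.

```python
import math

def normal_mode_frequencies(M, V):
    """Frequencies of a 2-degree-of-freedom system from det(V - w^2 M) = 0.

    M and V are symmetric 2x2 matrices given as nested lists.
    """
    (m11, m12), (_, m22) = M
    (v11, v12), (_, v22) = V
    # det(V - lam*M) = a*lam^2 + b*lam + c with lam = w^2
    a = m11 * m22 - m12 * m12
    b = -(v11 * m22 + v22 * m11 - 2.0 * v12 * m12)
    c = v11 * v22 - v12 * v12
    root = math.sqrt(b * b - 4.0 * a * c)
    lams = sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
    return [math.sqrt(lam) for lam in lams]

# Illustrative system: equal unit masses, three unit springs (wall-mass-mass-wall).
M = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, -1.0], [-1.0, 2.0]]
freqs = normal_mode_frequencies(M, V)
print(freqs)
```

For this choice the two modes are the in-phase motion with $\omega^2 = 1$ and the out-of-phase motion with $\omega^2 = 3$.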



1.3.4 Continuous systems 

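The action principle can be extended to field theories and to continuum
mechanics. Here one has a continuous infinity of dynamical degrees of freedom,
either one for each point in space and time or one for each point in the
material, but the extension of the variational derivative to functions of more than
one variable should pose no conceptual difficulties.

Suppose we are given an action functional $S[\varphi]$ depending on a field $\varphi(x^\mu)$
and its first derivatives

\[ \varphi_\mu \equiv \frac{\partial \varphi}{\partial x^\mu}. \tag{1.75} \]

Here $x^\mu$, $\mu = 0, 1, \ldots, d$, are the coordinates of $d+1$ dimensional space-time.
It is traditional to take $x^0 \equiv t$ and the other coordinates spacelike. Suppose
further that

\[ S[\varphi] = \int L\,dt = \int \mathcal{L}(x^\mu, \varphi, \varphi_\mu)\,d^{d+1}x, \tag{1.76} \]

where $\mathcal{L}$ is the Lagrangian density, in terms of which

\[ L = \int \mathcal{L}\,d^d x, \tag{1.77} \]

and the integral is over the space coordinates. Now

\[
\begin{aligned}
\delta S &= \int \left\{ \delta\varphi(x)\frac{\partial \mathcal{L}}{\partial \varphi(x)}
   + \delta(\varphi_\mu(x))\frac{\partial \mathcal{L}}{\partial \varphi_\mu(x)} \right\} d^{d+1}x \\
 &= \int \delta\varphi(x)\left\{ \frac{\partial \mathcal{L}}{\partial \varphi(x)}
   - \frac{\partial}{\partial x^\mu}\left( \frac{\partial \mathcal{L}}{\partial \varphi_\mu(x)} \right) \right\} d^{d+1}x.
\end{aligned} \tag{1.78}
\]

In going from the first line to the second, we have observed that

\[ \delta(\varphi_\mu(x)) = \frac{\partial}{\partial x^\mu}\,\delta\varphi(x) \tag{1.79} \]

and used the divergence theorem,

\[ \int_\Omega \left( \frac{\partial A^\mu}{\partial x^\mu} \right) d^{d+1}x = \int_{\partial\Omega} A^\mu n_\mu\,dS, \tag{1.80} \]

where $\Omega$ is some space-time region and $\partial\Omega$ its boundary, to integrate by
parts. Here $dS$ is the element of area on the boundary, and $n_\mu$ the outward
normal. As before, we take $\delta\varphi$ to vanish on the boundary, and hence there
is no boundary contribution to the variation of $S$. The result is that

\[ \frac{\delta S}{\delta\varphi(x)} = \frac{\partial \mathcal{L}}{\partial \varphi(x)}
   - \frac{\partial}{\partial x^\mu}\left( \frac{\partial \mathcal{L}}{\partial \varphi_\mu(x)} \right), \tag{1.81} \]

and the equation of motion comes from setting this to zero. Note that a sum
over the repeated coordinate index $\mu$ is implied. In practice it is easier not to
use this formula. Instead, make the variation by hand, as in the following
examples.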

Example: The Vibrating string. The simplest continuous dynamical system
is the transversely vibrating string. We describe the string displacement by
$y(x, t)$.

Let us suppose that the string has fixed ends, a mass per unit length
of $\rho$, and is under tension $T$. If we assume only small displacements from
equilibrium, the Lagrangian is

\[ L = \int_0^L dx \left\{ \frac{1}{2}\rho\dot y^2 - \frac{1}{2}T y'^2 \right\}. \tag{1.82} \]

Figure 1.8: Transversely vibrating string.
The dot denotes a partial derivative with respect to $t$, and the prime a partial
derivative with respect to $x$. The variation of the action is

\[
\begin{aligned}
\delta S &= \iint_0^L dt\,dx \left\{ \rho\dot y\,\delta\dot y - T y'\,\delta y' \right\} \\
 &= \iint_0^L dt\,dx \left\{ \delta y(x, t)\left( -\rho\ddot y + T y'' \right) \right\}.
\end{aligned} \tag{1.83}
\]

To reach the second line we have integrated by parts, and, because the ends
are fixed, and therefore $\delta y = 0$ at $x = 0$ and $L$, there is no boundary term.
Requiring that $\delta S = 0$ for all allowed variations $\delta y$ then gives the equation
of motion

\[ \rho\ddot y - T y'' = 0. \tag{1.84} \]

This is the wave equation describing transverse waves propagating with speed
$c = \sqrt{T/\rho}$. Observe that from (1.83) we can read off the functional derivative
of $S$ with respect to the variable $y(x, t)$ as being

\[ \frac{\delta S}{\delta y(x, t)} = -\rho\ddot y(x, t) + T y''(x, t). \tag{1.85} \]

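In writing down the first integral for this continuous system, we must
replace the sum over discrete indices by an integral:

\[ E = \sum_i \dot q_i\frac{\partial L}{\partial \dot q_i} - L
   \;\to\; \int dx \left\{ \dot y(x)\frac{\delta L}{\delta \dot y(x)} \right\} - L. \tag{1.86} \]

When computing $\delta L/\delta\dot y(x)$ from

\[ L = \int_0^L dx \left\{ \frac{1}{2}\rho\dot y^2 - \frac{1}{2}T y'^2 \right\} \]

we must remember that it is the continuous analogue of $\partial L/\partial\dot q_i$, and so, in
contrast to what we do when computing $\delta S/\delta y(x)$, we must treat $\dot y(x)$ as a
variable independent of $y(x)$. We then have

\[ \frac{\delta L}{\delta \dot y(x)} = \rho\dot y(x), \tag{1.87} \]

leading to

\[ E = \int_0^L dx \left\{ \frac{1}{2}\rho\dot y^2 + \frac{1}{2}T y'^2 \right\}. \tag{1.88} \]

This, as expected, is the total energy, kinetic plus potential, of the string.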
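As a quick illustration that (1.88) is indeed constant on a solution of the wave equation (1.84), the sketch below evaluates the energy of the fundamental standing wave $y = A\sin(\pi x/L)\cos(\pi c t/L)$ at several times. The parameter values and function name are assumptions made for the example; the integral is done with the trapezoidal rule, and the analytic value $A^2\pi^2 T/4L$ is printed for comparison.

```python
import math

RHO = 1.0        # mass per unit length (assumed value)
TENSION = 4.0    # string tension T (assumed value)
LENGTH = 1.0     # string length L (assumed value)
AMP = 0.01       # amplitude A of the standing wave (assumed value)
C = math.sqrt(TENSION / RHO)   # wave speed c = sqrt(T / rho)

def string_energy(t, n=2000):
    # Trapezoidal estimate of (1.88) for y = A sin(pi x / L) cos(pi c t / L).
    k = math.pi / LENGTH
    h = LENGTH / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        ydot = -AMP * k * C * math.sin(k * x) * math.sin(k * C * t)
        yprime = AMP * k * math.cos(k * x) * math.cos(k * C * t)
        density = 0.5 * RHO * ydot ** 2 + 0.5 * TENSION * yprime ** 2
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * density * h
    return total

expected = AMP ** 2 * math.pi ** 2 * TENSION / (4.0 * LENGTH)
energies = [string_energy(t) for t in (0.0, 0.1, 0.37, 1.0)]
print(energies, expected)
```

The kinetic and potential contributions trade places as $\sin^2$ and $\cos^2$ of the same phase, so their sum does not depend on $t$.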
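The energy-momentum tensor

If we consider an action of the form

\[ S = \int \mathcal{L}(\varphi, \varphi_\mu)\,d^{d+1}x, \tag{1.89} \]

in which $\mathcal{L}$ does not depend explicitly on any of the co-ordinates $x^\mu$, we may
refine Noether's derivation of the law of conservation of total energy and obtain
accounting information about the position-dependent energy density. To do
this we make a variation of the form

\[ \varphi(x) \to \varphi(x^\mu + \varepsilon^\mu(x)) = \varphi(x^\mu) + \varepsilon^\mu(x)\,\partial_\mu\varphi + O(|\varepsilon|^2), \tag{1.90} \]

where $\varepsilon$ depends on $x \equiv (x^0, \ldots, x^d)$. The resulting variation in $S$ is

\[ \delta S = \int \varepsilon^\mu(x)\,\frac{\partial}{\partial x^\nu}\left\{ \mathcal{L}\,\delta^\nu_\mu
   - \frac{\partial \mathcal{L}}{\partial \varphi_\nu}\,\partial_\mu\varphi \right\} d^{d+1}x. \tag{1.91} \]

When $\varphi$ satisfies the equations of motion this $\delta S$ will be zero for arbitrary
$\varepsilon^\mu(x)$. We conclude that

\[ \frac{\partial}{\partial x^\nu}\left\{ \mathcal{L}\,\delta^\nu_\mu
   - \frac{\partial \mathcal{L}}{\partial \varphi_\nu}\,\partial_\mu\varphi \right\} = 0. \tag{1.92} \]

The $(d+1)$-by-$(d+1)$ array of functions

\[ T^\nu{}_\mu \equiv \frac{\partial \mathcal{L}}{\partial \varphi_\nu}\,\partial_\mu\varphi - \delta^\nu_\mu\,\mathcal{L} \tag{1.93} \]

is known as the canonical energy-momentum tensor because the statement

\[ \partial_\nu T^\nu{}_\mu = 0 \tag{1.94} \]

often provides book-keeping for the flow of energy and momentum.

In the case of the vibrating string, the $\mu = 0, 1$ components of $\partial_\nu T^\nu{}_\mu = 0$
become the two following local conservation equations:

\[ \frac{\partial}{\partial t}\left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\}
   + \frac{\partial}{\partial x}\left\{ -T\dot y y' \right\} = 0, \tag{1.95} \]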
and

\[ \frac{\partial}{\partial t}\left\{ -\rho\dot y y' \right\}
   + \frac{\partial}{\partial x}\left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\} = 0. \tag{1.96} \]

It is easy to verify that these are indeed consequences of the wave equation.
They are "local" conservation laws because they are of the form

\[ \frac{\partial q}{\partial t} + \operatorname{div}\mathbf{J} = 0, \tag{1.97} \]

where $q$ is the local density, and $\mathbf{J}$ the flux, of the globally conserved quantity
$Q = \int q\,d^d x$. In the first case, the local density $q$ is

\[ T^0{}_0 = \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2, \tag{1.98} \]

which is the energy density. The energy flux is given by $T^1{}_0 \equiv -T\dot y y'$, which
is the rate at which a segment of string is doing work on its neighbour to the right.
Integrating over $x$, and observing that the fixed-end boundary conditions are
such that

\[ \int_0^L \frac{\partial}{\partial x}\left\{ -T\dot y y' \right\} dx = \left[ -T\dot y y' \right]_0^L = 0, \tag{1.99} \]

gives us

\[ \frac{d}{dt}\int_0^L \left\{ \frac{\rho}{2}\dot y^2 + \frac{T}{2}y'^2 \right\} dx = 0, \tag{1.100} \]

which is the global energy conservation law we obtained earlier.

The physical interpretation of $T^0{}_1 = -\rho\dot y y'$, the locally conserved
quantity appearing in (1.96), is less obvious. If this were a relativistic system,
we would immediately identify $\int T^0{}_1\,dx$ as the $x$-component of the
energy-momentum 4-vector, and therefore $T^0{}_1$ as the density of $x$-momentum. Now
any real string will have some motion in the $x$ direction, but the magnitude
of this motion will depend on the string's elastic constants and other
quantities unknown to our Lagrangian. Because of this, the $T^0{}_1$ derived
from $L$ cannot be the string's $x$-momentum density. Instead, it is the density
of something called pseudo-momentum. The distinction between true
and pseudo-momentum is best appreciated by considering the corresponding
Noether symmetry. The symmetry associated with Newtonian momentum
is the invariance of the action integral under an $x$ translation of the
entire apparatus: the string, and any wave on it. The symmetry associated
with pseudo-momentum is the invariance of the action under a shift
$y(x) \to y(x - a)$ of the location of the wave on the string, the string
itself not being translated. Newtonian momentum is conserved if the ambient
space is translationally invariant. Pseudo-momentum is conserved only if the
string is translationally invariant, i.e. if $\rho$ and $T$ are position independent.
A failure to realize that the presence of a medium (here the string) requires us
to distinguish between these two symmetries is the origin of much confusion
involving "wave momentum."



Maxwell's equations 

Michael Faraday and James Clerk Maxwell's description of electromagnetism
in terms of dynamical vector fields gave us the first modern field
theory. D'Alembert and Maupertuis would have been delighted to discover 
that the famous equations of Maxwell's A Treatise on Electricity and Mag- 
netism (1873) follow from an action principle. There is a slight complication 
stemming from gauge invariance but, as long as we are not interested in ex- 
hibiting the covariance of Maxwell under Lorentz transformations, we can 
sweep this under the rug by working in the axial gauge, where the scalar 
electric potential does not appear. 

We will start from Maxwell's equations 

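    \operatorname{div}\mathbf{B} = 0,
    \operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t},
    \operatorname{curl}\mathbf{H} = \mathbf{J} + \frac{\partial\mathbf{D}}{\partial t},
    \operatorname{div}\mathbf{D} = \rho, \qquad (1.101)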

and show that they can be obtained from an action principle. For convenience
we shall use natural units in which μ₀ = ε₀ = 1, and so c = 1 and D = E
and B = H.

The first equation, div B = 0, contains no time derivatives. It is a
constraint which we satisfy by introducing a vector potential A such that
B = curl A. If we set

    \mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t}, \qquad (1.102)

then this automatically implies Faraday's law of induction

    \operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}. \qquad (1.103)

We now guess that the Lagrangian is

    L = \int d^3x \left[\tfrac{1}{2}\{\mathbf{E}^2 - \mathbf{B}^2\} + \mathbf{J}\cdot\mathbf{A}\right]. \qquad (1.104)



The motivation is that L looks very like T − V if we regard ½E² ≡ ½Ȧ² as
being the kinetic energy and ½B² = ½(curl A)² as being the potential energy.
The term in J represents the interaction of the fields with an external current
source. In the axial gauge the electric charge density ρ does not appear in
the Lagrangian. The corresponding action is therefore

    S = \int L\,dt = \iint d^3x \left[\tfrac{1}{2}\dot{\mathbf{A}}^2 - \tfrac{1}{2}(\operatorname{curl}\mathbf{A})^2 + \mathbf{J}\cdot\mathbf{A}\right]dt. \qquad (1.105)

Now vary A to A + δA, whence

    \delta S = \iint d^3x \left[-\ddot{\mathbf{A}}\cdot\delta\mathbf{A} - (\operatorname{curl}\mathbf{A})\cdot(\operatorname{curl}\delta\mathbf{A}) + \mathbf{J}\cdot\delta\mathbf{A}\right]dt. \qquad (1.106)



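Here, we have already removed the time derivative from δA by integrating
by parts in the time direction. Now we do the integration by parts in the
space directions by using the identity

    \operatorname{div}(\delta\mathbf{A}\times(\operatorname{curl}\mathbf{A})) = (\operatorname{curl}\mathbf{A})\cdot(\operatorname{curl}\delta\mathbf{A}) - \delta\mathbf{A}\cdot(\operatorname{curl}(\operatorname{curl}\mathbf{A})) \qquad (1.107)

and taking δA to vanish at spatial infinity, so the surface term, which would
come from the integral of the total divergence, is zero. We end up with

    \delta S = \iint d^3x \left\{\delta\mathbf{A}\cdot\left[-\ddot{\mathbf{A}} - \operatorname{curl}(\operatorname{curl}\mathbf{A}) + \mathbf{J}\right]\right\}dt. \qquad (1.108)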



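Demanding that the variation of S be zero thus requires

    \frac{\partial^2\mathbf{A}}{\partial t^2} = -\operatorname{curl}(\operatorname{curl}\mathbf{A}) + \mathbf{J}, \qquad (1.109)

or, in terms of the physical fields,

    \operatorname{curl}\mathbf{B} = \mathbf{J} + \frac{\partial\mathbf{E}}{\partial t}. \qquad (1.110)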



This is Ampère's law, as modified by Maxwell so as to include the
displacement current.

How do we deal with the last Maxwell equation, Gauss' law, which asserts
that div E = ρ? If ρ were equal to zero, this equation would hold if div A = 0,
i.e. if A were solenoidal. In this case we might be tempted to impose the
constraint div A = 0 on the vector potential, but doing so would undo all
our good work, as we have been assuming that we can vary A freely.

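We notice, however, that the three Maxwell equations we already possess
tell us that

    \frac{\partial}{\partial t}(\operatorname{div}\mathbf{E} - \rho) = \operatorname{div}(\operatorname{curl}\mathbf{B}) - \left(\operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t}\right). \qquad (1.111)

Now div (curl B) = 0, so the left-hand side is zero provided charge is
conserved, i.e. provided

    \dot{\rho} + \operatorname{div}\mathbf{J} = 0. \qquad (1.112)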

We assume that this is so. Thus, if Gauss' law holds initially, it holds
eternally. We arrange for it to hold at t = 0 by imposing initial conditions
on A. We first choose A|_{t=0} by requiring it to satisfy

    \mathbf{B}|_{t=0} = \operatorname{curl}(\mathbf{A}|_{t=0}). \qquad (1.113)

The solution is not unique, because we may add any ∇φ to A|_{t=0}, but this
does not affect the physical E and B fields. The initial "velocities" Ȧ|_{t=0}
are then fixed uniquely by Ȧ|_{t=0} = −E|_{t=0}, where the initial E satisfies
Gauss' law. The subsequent evolution of A is then uniquely determined by
integrating the second-order equation (1.109).

The first integral for Maxwell is

    E = \int d^3x \left[\tfrac{1}{2}\{\mathbf{E}^2 + \mathbf{B}^2\} - \mathbf{J}\cdot\mathbf{A}\right]. \qquad (1.114)



This will be conserved if J is time independent. If J = 0, it is the total field 
energy. 

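Suppose J is neither zero nor time independent. Then, looking back at
the derivation of the time-independence of the first integral, we see that if L
does depend on time, we instead have

    \frac{dE}{dt} = -\frac{\partial L}{\partial t}. \qquad (1.115)

In the present case we have

    -\frac{\partial L}{\partial t} = -\int \dot{\mathbf{J}}\cdot\mathbf{A}\,d^3x, \qquad (1.116)

so that

    -\int \dot{\mathbf{J}}\cdot\mathbf{A}\,d^3x = \frac{dE}{dt} = \frac{d}{dt}(\text{Field Energy}) - \int\left\{\mathbf{J}\cdot\dot{\mathbf{A}} + \dot{\mathbf{J}}\cdot\mathbf{A}\right\}d^3x. \qquad (1.117)

Thus, cancelling the duplicated term and using E = −Ȧ, we find

    \frac{d}{dt}(\text{Field Energy}) = -\int \mathbf{J}\cdot\mathbf{E}\,d^3x. \qquad (1.118)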



Now ∫ J · (−E) d³x is the rate at which the power source driving the current
is doing work against the field. The result is therefore physically sensible.



Continuum mechanics 

Because the mechanics of discrete objects can be derived from an action 
principle, it seems obvious that so must the mechanics of continua. This is 
certainly true if we use the Lagrangian description where we follow the his- 
tory of each particle composing the continuous material as it moves through 
space. In fluid mechanics it is more natural to describe the motion by using 
the Eulerian description in which we focus on what is going on at a partic- 
ular point in space by introducing a velocity field v(r,t). Eulerian action 
principles can still be found, but they seem to be logically distinct from the 
Lagrangian mechanics action principle, and mostly were not discovered until 
the 20th century. 

We begin by showing that Euler's equation for the irrotational motion
of an inviscid compressible fluid can be obtained by applying the action
principle to a functional

    S[\phi,\rho] = \int dt\,d^3x \left\{\rho\frac{\partial\phi}{\partial t} + \tfrac{1}{2}\rho(\nabla\phi)^2 + u(\rho)\right\}, \qquad (1.119)

where ρ is the mass density and the flow velocity is determined from the
velocity potential by v = ∇φ. The function u(ρ) is the internal energy
density.

Varying S[φ, ρ] with respect to ρ is straightforward, and gives a time
dependent generalization of (Daniel) Bernoulli's equation

    \frac{\partial\phi}{\partial t} + \tfrac{1}{2}\mathbf{v}^2 + h(\rho) = 0. \qquad (1.120)

Here h(ρ) ≡ du/dρ is the specific enthalpy.¹ Varying with respect to φ
requires an integration by parts, based on

    \operatorname{div}(\rho\,\delta\phi\,\nabla\phi) = \rho(\nabla\delta\phi)\cdot(\nabla\phi) + \delta\phi\,\operatorname{div}(\rho\nabla\phi), \qquad (1.121)

and gives the equation of mass conservation

    \frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \qquad (1.122)

Taking the gradient of Bernoulli's equation, and using the fact that for
potential flow the vorticity ω ≡ curl v is zero and so ∂_i v_j = ∂_j v_i, we
find that

    \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\nabla h. \qquad (1.123)

We now introduce the pressure P, which is related to h by

    h(P) = \int_0^P \frac{dP}{\rho(P)}. \qquad (1.124)

We see that ρ∇h = ∇P, and so obtain Euler's equation

    \rho\left(\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right) = -\nabla P. \qquad (1.125)

For future reference, we observe that combining the mass-conservation
equation

    \partial_t\rho + \partial_j\{\rho v_j\} = 0 \qquad (1.126)

with Euler's equation

    \rho(\partial_t v_i + v_j\partial_j v_i) = -\partial_i P \qquad (1.127)

yields

    \partial_t\{\rho v_i\} + \partial_j\{\rho v_i v_j + \delta_{ij}P\} = 0, \qquad (1.128)

which expresses the local conservation of momentum. The quantity

    \Pi_{ij} = \rho v_i v_j + \delta_{ij}P \qquad (1.129)

is the momentum-flux tensor, and is the j-th component of the flux of the
i-th component p_i = ρv_i of momentum density.

The relations h = du/dρ and ρ = dP/dh show that P and u are related
by a Legendre transformation: P = ρh − u(ρ). From this, and the Bernoulli
equation, we see that the integrand in the action (1.119) is equal to minus
the pressure:

    \rho\frac{\partial\phi}{\partial t} + \tfrac{1}{2}\rho(\nabla\phi)^2 + u(\rho) = -P. \qquad (1.130)

This Eulerian formulation cannot be a "follow the particle" action
principle in a clever disguise. The mass conservation law is only a consequence
of the equation of motion, and is not built in from the beginning as a
constraint. Our variations in φ are therefore conjuring up new matter rather
than merely moving it around.

¹ The enthalpy H = U + PV per unit mass. In general u and h will be functions of
both the density and the specific entropy. By taking u to depend only on ρ we are tacitly
assuming that specific entropy is constant. This makes the resultant flow barotropic,
meaning that the pressure is a function of the density only.

1.4 Variable End Points

We now relax our previous assumption that all boundary or surface terms
arising from integrations by parts may be ignored. We will find that variation
principles can be very useful for working out what boundary conditions we
should impose on our differential equations.

Consider the problem of building a railway across a parallel-sided isthmus.

Figure 1.9: Railway across isthmus.

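Suppose that the cost of construction is proportional to the length of the
track but that, the cost of sea transport being negligible, we may locate the
terminal seaports wherever we like. We therefore wish to minimize the length

    L[y] = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\,dx, \qquad (1.131)

by allowing both the path y(x) and the endpoints y(x₁) and y(x₂) to vary.
Then

    L[y + \delta y] - L[y] = \int_{x_1}^{x_2} (\delta y')\frac{y'}{\sqrt{1 + (y')^2}}\,dx
      = \int_{x_1}^{x_2} \left\{\frac{d}{dx}\left(\delta y\frac{y'}{\sqrt{1 + (y')^2}}\right) - \delta y\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right)\right\}dx
      = \delta y(x_2)\frac{y'(x_2)}{\sqrt{1 + (y')^2}} - \delta y(x_1)\frac{y'(x_1)}{\sqrt{1 + (y')^2}} - \int_{x_1}^{x_2} \delta y\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right)dx. \qquad (1.132)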



We have stationarity when both

i) the coefficient of δy(x) in the integral,

    -\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right), \qquad (1.133)

is zero. This requires that y' = const., i.e. the track should be straight.

ii) the coefficients of δy(x₁) and δy(x₂) vanish. For this we need

    0 = \frac{y'(x_1)}{\sqrt{1 + (y')^2}} = \frac{y'(x_2)}{\sqrt{1 + (y')^2}}. \qquad (1.134)

This in turn requires that y'(x₁) = y'(x₂) = 0.
The integrated-out bits have determined the boundary conditions that are to 
be imposed on the solution of the differential equation. In the present case 
they require us to build perpendicular to the coastline, and so we go straight 
across the isthmus. When boundary conditions are obtained from endpoint 
variations in this way, they are called natural boundary conditions. 
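
The railway argument can be checked numerically. The following is only a sketch under stated assumptions: the track is replaced by a polygonal path on a uniform grid, the endpoint heights are left free, and plain gradient descent (with a hypothetical step size and iteration count chosen for this grid) is used to reduce the length. The helper names are illustrative and not part of the text.

```python
import math

def track_length(y, h):
    """Length of the polygonal track through the heights y, grid spacing h."""
    return sum(math.hypot(h, y[i + 1] - y[i]) for i in range(len(y) - 1))

def length_gradient(y, h):
    """dL/dy_i. The two end nodes receive one-sided terms, the discrete
    analogue of the boundary terms -/+ y'/sqrt(1 + y'^2) in (1.132)."""
    grad = [0.0] * len(y)
    for i in range(len(y) - 1):
        term = (y[i + 1] - y[i]) / math.hypot(h, y[i + 1] - y[i])
        grad[i] -= term
        grad[i + 1] += term
    return grad

def relax(y, h, step=0.02, iterations=5000):
    """Plain gradient descent with *free* endpoints (nothing is pinned)."""
    y = list(y)
    for _ in range(iterations):
        g = length_gradient(y, h)
        y = [yi - step * gi for yi, gi in zip(y, g)]
    return y

width, n = 1.0, 11
h = width / (n - 1)
start = [0.3 * math.sin(3.0 * i * h) + 0.5 * i * h for i in range(n)]
final = relax(start, h)

print("initial length:", track_length(start, h))
print("final length:  ", track_length(final, h))
print("height spread: ", max(final) - min(final))
```

Under these assumptions the relaxed track becomes flat, so it meets both coasts at right angles and its length approaches the width of the isthmus, in line with the natural boundary conditions y'(x₁) = y'(x₂) = 0.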
Example: Sliding String. A massive string of linear density ρ is stretched
between two smooth posts separated by distance 2L. The string is under
tension T, and is free to slide up and down the posts. We consider only
small deviations of the string from the horizontal.



Figure 1.10: Sliding string.

As we saw earlier, the Lagrangian for a stretched string is

    L = \int_{-L}^{L} \left\{\tfrac{1}{2}\rho\dot{y}^2 - \tfrac{1}{2}T(y')^2\right\}dx. \qquad (1.135)

Now, Lagrange's principle says that the equation of motion is found by
requiring the action

    S = \int_{t_i}^{t_f} L\,dt \qquad (1.136)

to be stationary under variations of y(x, t) that vanish at the initial and final
times, t_i and t_f. It does not demand that δy vanish at the ends of the string,
x = ±L. So, when we make the variation, we must not assume this. Taking
care not to discard the results of the integration by parts in the x direction,
we find

    \delta S = \int_{t_i}^{t_f}\int_{-L}^{L} \delta y(x,t)\{-\rho\ddot{y} + Ty''\}\,dx\,dt - \int_{t_i}^{t_f} \delta y(L,t)\,Ty'(L)\,dt
             + \int_{t_i}^{t_f} \delta y(-L,t)\,Ty'(-L)\,dt. \qquad (1.137)

The equation of motion, which arises from the variation within the interval,
is therefore the wave equation

    \rho\ddot{y} - Ty'' = 0. \qquad (1.138)



The boundary conditions, which come from the variations at the endpoints, 
are 

y'(L,t) = y'(-L,t)=0, (1.139) 

at all times t. These are the physically correct boundary conditions, because 
any up-or-down component of the tension would provide a finite force on an 
infinitesimal mass. The string must therefore be horizontal at its endpoints. 
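
As a quick sanity check of the sliding-string conditions, the sketch below tests one family of standing waves. The mode shape cos(k(x + L)) cos(ωt) with k = nπ/(2L) and ω = k√(T/ρ) is an assumed ansatz, not something derived in the text, and the finite-difference helpers and sample point are illustrative choices.

```python
import math

def mode(n, x, t, half_length=1.0, tension=1.0, density=1.0):
    """Assumed standing wave cos(k (x + L)) cos(omega t) on -L <= x <= L."""
    k = n * math.pi / (2.0 * half_length)
    omega = k * math.sqrt(tension / density)
    return math.cos(k * (x + half_length)) * math.cos(omega * t)

def second_difference(f, s, ds=1e-4):
    """Central estimate of the second derivative of f at s."""
    return (f(s + ds) - 2.0 * f(s) + f(s - ds)) / ds ** 2

def first_difference(f, s, ds=1e-6):
    """Central estimate of the first derivative of f at s."""
    return (f(s + ds) - f(s - ds)) / (2.0 * ds)

n, x0, t0 = 3, 0.37, 0.81
y_tt = second_difference(lambda t: mode(n, x0, t), t0)
y_xx = second_difference(lambda x: mode(n, x, t0), x0)
wave_residual = 1.0 * y_tt - 1.0 * y_xx      # rho * y_tt - T * y_xx with rho = T = 1
end_slopes = [first_difference(lambda x: mode(n, x, t0), end) for end in (-1.0, 1.0)]

print("wave-equation residual:", wave_residual)
print("slopes at x = -L, +L:  ", end_slopes)
```

Both the residual of (1.138) and the end slopes of (1.139) come out at the level of the finite-difference error, which is what the natural boundary conditions lead one to expect for these modes.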
Example: Bead and String. Suppose now that a bead of mass M is free to
slide up and down the y axis, and is attached to the x = 0 end of our string.

Figure 1.11: A bead connected to a string.

The Lagrangian for the string-bead contraption is

    L = \tfrac{1}{2}M[\dot{y}(0)]^2 + \int_0^L \left\{\tfrac{1}{2}\rho\dot{y}^2 - \tfrac{1}{2}Ty'^2\right\}dx. \qquad (1.140)

Here, as before, ρ is the mass per unit length of the string and T is its tension.
The end of the string at x = L is fixed. By varying the action S = ∫ L dt,
and taking care not to throw away the boundary part at x = 0, we find that

    \delta S = \int_{t_i}^{t_f} \left[Ty' - M\ddot{y}\right]_{x=0}\,\delta y(0,t)\,dt + \int_{t_i}^{t_f}\int_0^L \{Ty'' - \rho\ddot{y}\}\,\delta y(x,t)\,dx\,dt. \qquad (1.141)



The Euler-Lagrange equations are therefore

    \rho\ddot{y}(x) - Ty''(x) = 0, \qquad 0 < x < L,
    M\ddot{y}(0) - Ty'(0) = 0, \qquad y(L) = 0. \qquad (1.142)

The boundary condition at x = 0 is the equation of motion for the bead. It
is clearly correct, because Ty'(0) is the vertical component of the force that 
the string tension exerts on the bead. 
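
To see what the bead boundary condition does in practice, one can look for a standing wave. The sketch below assumes the ansatz y = sin(k(L − x)) cos(ωt), which satisfies y(L) = 0 and, with ω² = Tk²/ρ, the wave equation; substituting it into Mÿ(0) − Ty'(0) = 0 gives the condition k tan(kL) = ρ/M. The ansatz, the function name, and the parameter values are illustrative assumptions, not part of the text.

```python
import math

def lowest_wavenumber(length, density, bead_mass, tol=1e-13):
    """Smallest k > 0 with k tan(k L) = density / bead_mass, by bisection.

    On 0 < k < pi / (2 L) the left-hand side rises monotonically from 0
    to +infinity, so exactly one root lies in that interval.
    """
    target = density / bead_mass
    lo, hi = 0.0, 0.5 * math.pi / length
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.tan(mid * length) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

length, density, bead_mass, tension = 1.0, 1.0, 1.0, 1.0
k = lowest_wavenumber(length, density, bead_mass)
omega = k * math.sqrt(tension / density)

# Residual of the bead equation M y_tt(0) - T y'(0) = 0, amplitude factored out.
bead_residual = -bead_mass * omega ** 2 * math.sin(k * length) + tension * k * math.cos(k * length)
print("k =", k, " omega =", omega, " bead residual =", bead_residual)
```

In the limits of this sketch, a very heavy bead pushes kL toward 0 on this branch (the bead barely moves the string's lowest mode), while a very light bead pushes kL toward π/2, which is the free-end condition y'(0) = 0 of the sliding string.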

These examples led to boundary conditions that we could easily have 
figured out for ourselves without the variational principle. The next exam- 
ple shows that a variational formulation can be exploited to obtain a set of 
boundary conditions that might be difficult to write down by purely "physi- 
cal" reasoning. 




Figure 1.12: Gravity waves on water. 

Harder example: Gravity waves on the surface of water. An action suitable
for describing water waves is given by² S[φ, h] = ∫ L dt, where

    L = \int dx \int_0^{h(x,t)} \rho_0\left\{\frac{\partial\phi}{\partial t} + \tfrac{1}{2}(\nabla\phi)^2 + gy\right\}dy. \qquad (1.143)

² J. C. Luke, J. Fluid Mech. 27 (1967) 395.

Here φ is the velocity potential and ρ₀ is the density of the water. The density
will not be varied because the water is being treated as incompressible. As
before, the flow velocity is given by v = ∇φ. By varying φ(x, y, t) and the
depth h(x, t), and taking care not to throw away any integrated-out parts of
the variation at the physical boundaries, we obtain:

    \nabla^2\phi = 0, \quad \text{within the fluid,}
    \frac{\partial\phi}{\partial t} + \tfrac{1}{2}(\nabla\phi)^2 + gy = 0, \quad \text{on the free surface,}
    \frac{\partial\phi}{\partial y} = 0, \quad \text{on } y = 0,
    \frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} = 0, \quad \text{on the free surface.} \qquad (1.144)

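The first equation comes from varying φ within the fluid, and it simply
confirms that the flow is incompressible, i.e. obeys div v = 0. The second
comes from varying h, and is the Bernoulli equation stating that we have
P = P₀ (atmospheric pressure) everywhere on the free surface. The third,
from the variation of φ at y = 0, states that no fluid escapes through the
lower boundary.

Obtaining and interpreting the last equation, involving ∂h/∂t, is somewhat
trickier. It comes from the variation of φ on the upper boundary. The
variation of S due to δφ is

    \delta S = \int \rho_0\left\{\frac{\partial\,\delta\phi}{\partial t} + \frac{\partial\phi}{\partial x}\frac{\partial\,\delta\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{\partial\,\delta\phi}{\partial y}\right\}dt\,dx\,dy
             = \int \rho_0\left\{\frac{\partial}{\partial t}\delta\phi + \frac{\partial}{\partial x}\left(\delta\phi\frac{\partial\phi}{\partial x}\right) + \frac{\partial}{\partial y}\left(\delta\phi\frac{\partial\phi}{\partial y}\right) - \delta\phi\,\nabla^2\phi\right\}dt\,dx\,dy. \qquad (1.145)

The first three terms in the integrand constitute the three-dimensional
divergence div (δφ Φ), where, listing components in the order t, x, y,

    \Phi = \left[1, \frac{\partial\phi}{\partial x}, \frac{\partial\phi}{\partial y}\right]. \qquad (1.146)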



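The integrated-out part on the upper surface is therefore ∫ ρ₀ (Φ · n) δφ d|S|.
Here, the outward normal is

    \mathbf{n} = \left(1 + \left(\frac{\partial h}{\partial t}\right)^2 + \left(\frac{\partial h}{\partial x}\right)^2\right)^{-1/2}\left[-\frac{\partial h}{\partial t}, -\frac{\partial h}{\partial x}, 1\right], \qquad (1.147)

and the element of area

    d|S| = \left(1 + \left(\frac{\partial h}{\partial t}\right)^2 + \left(\frac{\partial h}{\partial x}\right)^2\right)^{1/2}dt\,dx. \qquad (1.148)

The boundary variation is thus

    \delta S|_{\text{free surface}} = \int \rho_0\left\{\frac{\partial\phi}{\partial y} - \frac{\partial h}{\partial t} - \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x}\right\}\delta\phi\big(x, h(x,t), t\big)\,dx\,dt. \qquad (1.149)

Requiring this variation to be zero for arbitrary δφ(x, h(x, t), t) leads to

    \frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} = 0. \qquad (1.150)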


This last boundary condition expresses the geometrical constraint that the
surface moves with the fluid it bounds, or, in other words, that a fluid particle
initially on the surface stays on the surface. To see that this is so, define
f(x, y, t) = h(x, t) − y. The free surface is then determined by f(x, y, t) = 0.
Because the surface particles are carried with the flow, the convective
derivative of f,

    \frac{df}{dt} \equiv \frac{\partial f}{\partial t} + (\mathbf{v}\cdot\nabla)f, \qquad (1.151)

must vanish on the free surface. Using v = ∇φ and the definition of f, this
reduces to

    \frac{\partial h}{\partial t} + \frac{\partial\phi}{\partial x}\frac{\partial h}{\partial x} - \frac{\partial\phi}{\partial y} = 0, \qquad (1.152)

which is indeed the last boundary condition.

1.5 Lagrange Multipliers




Figure 1.13 shows the contour map of a hill of height h = f(x, y). The
hill is traversed by a road whose points satisfy the equation g(x, y) = 0. Our
challenge is to use the data h(x, y) and g(x, y) to find the highest point on
the road.

When r changes by dr = (dx, dy), the height f changes by

    df = \nabla f\cdot d\mathbf{r}, \qquad (1.153)

where ∇f = (∂_x f, ∂_y f). The highest point, being a stationary point, will
have df = 0 for all displacements dr that stay on the road, that is for
all dr such that dg = 0. Thus ∇f · dr must be zero for those dr such that
0 = ∇g · dr. In other words, at the highest point ∇f will be orthogonal to
all vectors that are orthogonal to ∇g. This is possible only if the vectors ∇f
and ∇g are parallel, and so ∇f = λ∇g for some λ.

To find the stationary point, therefore, we solve the equations 

    \nabla f - \lambda\nabla g = 0,
    g(x, y) = 0, \qquad (1.154)

simultaneously. 

Example: Let f = x² + y² and g = x + y − 1. Then ∇f = 2(x, y) and
∇g = (1, 1). So

    2(x, y) - \lambda(1, 1) = 0 \;\Rightarrow\; (x, y) = \frac{\lambda}{2}(1, 1),
    x + y = 1 \;\Rightarrow\; \lambda = 1 \;\Rightarrow\; (x, y) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right).
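
Because f is quadratic and g is linear, the conditions (1.154) for this example form a linear system in the unknowns (x, y, λ). The sketch below solves it with a small, hand-written Gauss-Jordan routine; the helper `solve_linear` is an illustrative name introduced here, not something from the text.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Unknowns (x, y, lam):
#   2x - lam = 0,   2y - lam = 0   (grad f - lam grad g = 0)
#   x + y = 1                      (the constraint g = 0)
coeffs = [[2.0, 0.0, -1.0],
          [0.0, 2.0, -1.0],
          [1.0, 1.0, 0.0]]
x, y, lam = solve_linear(coeffs, [0.0, 0.0, 1.0])
print(x, y, lam)
```

For a nonlinear f or g the same equations would no longer be linear and an iterative solver would be needed; the linear case is used here only because it matches the worked example.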

When there are n constraints, g₁ = g₂ = ··· = g_n = 0, we want ∇f to lie
in

    (\langle\nabla g_i\rangle^{\perp})^{\perp} = \langle\nabla g_i\rangle, \qquad (1.155)

where ⟨e_i⟩ denotes the space spanned by the vectors e_i and ⟨e_i⟩^⊥ is
its orthogonal complement. Thus ∇f lies in the space spanned by the
vectors ∇g_i, so there must exist n numbers λ_i such that

    \nabla f = \sum_{i=1}^{n} \lambda_i\nabla g_i. \qquad (1.156)

The numbers λ_i are called Lagrange multipliers. We can therefore regard our
problem as one of finding the stationary points of an auxiliary function

    F = f - \sum_i \lambda_i g_i, \qquad (1.157)

with the n undetermined multipliers λ_i, i = 1, . . . , n, subsequently being fixed
by imposing the n requirements that g_i = 0, i = 1, . . . , n.
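Example: Find the stationary points of

    F(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}\cdot A\mathbf{x} = \tfrac{1}{2}x_i A_{ij} x_j \qquad (1.158)

on the surface x · x = 1. Here A_ij is a symmetric matrix.
Solution: We look for stationary points of

    G(\mathbf{x}) = F(\mathbf{x}) - \tfrac{1}{2}\lambda|\mathbf{x}|^2. \qquad (1.159)

The derivatives we need are

    \frac{\partial F}{\partial x_k} = \tfrac{1}{2}\delta_{ki}A_{ij}x_j + \tfrac{1}{2}x_i A_{ij}\delta_{jk} = A_{kj}x_j, \qquad (1.160)

and

    \frac{\partial}{\partial x_k}\left(\tfrac{1}{2}\lambda x_j x_j\right) = \lambda x_k. \qquad (1.161)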






Thus, the stationary points must satisfy

    A_{kj}x_j = \lambda x_k, \qquad x_i x_i = 1, \qquad (1.162)



and so are the normalized eigenvectors of the matrix A. The Lagrange 
multiplier at each stationary point is the corresponding eigenvalue. 
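
A small numerical illustration of this result, under stated assumptions: the 2×2 symmetric matrix below is an arbitrary example, and power iteration is used only because it is short. It locates just one of the stationary points, the one belonging to the eigenvalue of largest magnitude, rather than all of them.

```python
import math

def mat_vec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def normalize(x):
    """Rescale x onto the constraint surface x . x = 1."""
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x]

def power_iteration(a, x0, steps=200):
    """Repeatedly apply A and renormalize; returns (x . A x, x)."""
    x = normalize(x0)
    for _ in range(steps):
        x = normalize(mat_vec(a, x))
    lam = sum(xi * yi for xi, yi in zip(x, mat_vec(a, x)))
    return lam, x

A = [[2.0, 1.0],
     [1.0, 2.0]]
lam, x = power_iteration(A, [1.0, 0.0])

# Gradient of G = F - (lam / 2) |x|^2, which should vanish at a stationary point.
residual = [ax_i - lam * x_i for ax_i, x_i in zip(mat_vec(A, x), x)]
print("lambda =", lam, " x =", x, " grad G =", residual)
```

At the point found, the multiplier λ = x · Ax coincides with an eigenvalue of A and the gradient of G vanishes to rounding error, as (1.162) requires.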
Example: Statistical Mechanics. Let Γ denote the classical phase space of a
mechanical system of n particles governed by Hamiltonian H(p, q). Let dΓ
be the Liouville measure d^{3n}p d^{3n}q. In statistical mechanics we work with
a probability density ρ(p, q) such that ρ(p, q)dΓ is the probability of the
system being in a state in the small region dΓ. The entropy associated with
the probability distribution is the functional

    S[\rho] = -\int_\Gamma \rho\ln\rho\,d\Gamma. \qquad (1.163)

We wish to find the ρ(p, q) that maximizes the entropy for a given energy

    \langle E\rangle = \int_\Gamma \rho H\,d\Gamma. \qquad (1.164)

We cannot vary ρ freely as we should preserve both the energy and the
normalization condition

    \int_\Gamma \rho\,d\Gamma = 1 \qquad (1.165)

that is required of any probability distribution. We therefore introduce two
Lagrange multipliers, 1 + α and β, to enforce the normalization and energy
conditions, and look for stationary points of

    F[\rho] = \int_\Gamma \{-\rho\ln\rho + (\alpha + 1)\rho - \beta\rho H\}\,d\Gamma. \qquad (1.166)

Now we can vary ρ freely, and hence find that

    \delta F = \int_\Gamma \{-\ln\rho + \alpha - \beta H\}\,\delta\rho\,d\Gamma. \qquad (1.167)

Requiring this to be zero gives us

    \rho(p, q) = e^{\alpha - \beta H(p,q)}, \qquad (1.168)

where α, β are determined by imposing the normalization and energy
constraints. This probability density is known as the canonical distribution, and
the parameter β is the inverse temperature β = 1/T.
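
The role of the two multipliers can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in for the phase-space integrals: it replaces Γ by three discrete states with made-up energies, fixes the mean energy to a chosen target, and finds β by bisection. The normalization multiplier α is absorbed into the usual normalizing sum.

```python
import math

def canonical(energies, beta):
    """p_k proportional to exp(-beta E_k); dividing by the sum plays the role of e^alpha."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(energies, beta):
    return sum(p * e for p, e in zip(canonical(energies, beta), energies))

def solve_beta(energies, target, lo=-50.0, hi=50.0, tol=1e-12):
    """The mean energy decreases monotonically with beta, so bisection applies."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0.0)

levels = [0.0, 1.0, 2.0]          # hypothetical energy levels
beta = solve_beta(levels, target=0.5)
p = canonical(levels, beta)

# Another normalized distribution with the same mean energy, for comparison.
rival = [0.6, 0.3, 0.1]
print("beta =", beta, " p =", p)
print("entropy(p) =", entropy(p), " entropy(rival) =", entropy(rival))
```

In this toy setting the exponential distribution has a larger entropy than the rival distribution satisfying the same two constraints, which is what the variational argument leads one to expect.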

Example: The Catenary. At last we have the tools to solve the problem of
the hanging chain of fixed length. We wish to minimize the potential energy

    E[y] = \int y\sqrt{1 + (y')^2}\,dx, \qquad (1.169)

subject to the constraint

    l[y] = \int \sqrt{1 + (y')^2}\,dx = \text{const.}, \qquad (1.170)

where the constant is the length of the chain. We introduce a Lagrange
multiplier λ and find the stationary points of

    F[y] = \int (y - \lambda)\sqrt{1 + (y')^2}\,dx, \qquad (1.171)

so, following our earlier methods, we find

    y = \lambda + \kappa\cosh\left(\frac{x + a}{\kappa}\right). \qquad (1.172)

We choose κ, λ, a to fix the two endpoints (two conditions) and the length
(one condition).
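
Fixing the three constants generally requires a numerical root-find. The sketch below treats only the symmetric case, an assumption made here for brevity: both ends hang at the same height at x = ±b, so that a = 0 and the length condition reduces to 2κ sinh(b/κ) = length. The function name and the sample numbers are illustrative.

```python
import math

def catenary_symmetric(half_span, length, end_height):
    """Return (kappa, lam) for y = lam + kappa cosh(x / kappa) on |x| <= half_span."""
    if length <= 2.0 * half_span:
        raise ValueError("the chain must be longer than the span")

    def arc_length(kappa):
        return 2.0 * kappa * math.sinh(half_span / kappa)

    # arc_length decreases monotonically from very large values to 2 * half_span.
    lo, hi = half_span / 50.0, half_span * 1.0e6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if arc_length(mid) > length:
            lo = mid          # too much slack: a flatter curve (larger kappa) is needed
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    lam = end_height - kappa * math.cosh(half_span / kappa)
    return kappa, lam

half_span, length, end_height = 1.0, 3.0, 0.0
kappa, lam = catenary_symmetric(half_span, length, end_height)
lowest_point = lam + kappa            # y(0), the bottom of the chain
print("kappa =", kappa, " lambda =", lam, " y(0) =", lowest_point)
```

For unequal end heights the offset a no longer vanishes and two coupled equations must be solved; that case is not attempted in this sketch.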

Example: Sturm-Liouville Problem. We wish to find the stationary points
of the quadratic functional

    J[y] = \int_{x_1}^{x_2} \tfrac{1}{2}\left\{p(x)(y')^2 + q(x)y^2\right\}dx, \qquad (1.173)

subject to the boundary conditions y(x) = 0 at the endpoints x₁, x₂ and the
normalization

    K[y] = \int_{x_1}^{x_2} y^2\,dx = 1. \qquad (1.174)

Taking the variation of J − (λ/2)K, we find

    \delta\left(J - \tfrac{\lambda}{2}K\right) = \int_{x_1}^{x_2} \left\{-(py')' + qy - \lambda y\right\}\delta y\,dx. \qquad (1.175)

Stationarity therefore requires

    -(py')' + qy = \lambda y, \qquad y(x_1) = y(x_2) = 0. \qquad (1.176)

This is the Sturm-Liouville eigenvalue problem. It is an infinite dimensional
analogue of the F(x) = ½ x · Ax problem.
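
The analogy with the matrix problem can be tried out numerically. The sketch below assumes the simplest case p = 1, q = 0 on [0, π], where sin(kx) satisfies (1.176) with λ = k²; it evaluates the quotient 2J[y]/K[y], which is the continuous counterpart of x · Ax / x · x. The midpoint-rule integrator and the function names are illustrative choices.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def rayleigh_quotient(p, q, y, dy, x1, x2):
    """2 J[y] / K[y], with J and K as in (1.173) and (1.174) before normalization."""
    two_j = integrate(lambda x: p(x) * dy(x) ** 2 + q(x) * y(x) ** 2, x1, x2)
    k = integrate(lambda x: y(x) ** 2, x1, x2)
    return two_j / k

p = lambda x: 1.0
q = lambda x: 0.0

eigs = []
for k in (1, 2, 3):
    eigs.append(rayleigh_quotient(p, q,
                                  lambda x: math.sin(k * x),
                                  lambda x: k * math.cos(k * x),
                                  0.0, math.pi))

# A trial function that vanishes at both ends but is not an eigenfunction.
trial = rayleigh_quotient(p, q,
                          lambda x: x * (math.pi - x),
                          lambda x: math.pi - 2.0 * x,
                          0.0, math.pi)
print("quotients for sin(kx):", eigs)
print("quotient for x(pi - x):", trial)
```

In this example the eigenfunctions return their eigenvalues k², while the parabola gives a value slightly above the lowest eigenvalue, consistent with the lowest stationary point being a minimum for this particular choice of p and q.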

Example: Irrotational Flow Again. Consider the action functional

    S[\mathbf{v}, \phi, \rho] = \int \left\{\tfrac{1}{2}\rho\mathbf{v}^2 - u(\rho) + \phi\left(\frac{\partial\rho}{\partial t} + \operatorname{div}\rho\mathbf{v}\right)\right\}dt\,d^3x. \qquad (1.177)

This is similar to our previous action for the irrotational barotropic flow of an
inviscid fluid, but here v is an independent variable and we have introduced
infinitely many Lagrange multipliers φ(x, t), one for each point of space-time,
so as to enforce the equation of mass conservation ρ̇ + div ρv = 0 everywhere,
and at all times. Equating δS/δv to zero gives v = ∇φ, and so these Lagrange
multipliers become the velocity potential as a consequence of the equations
of motion. The Bernoulli and Euler equations now follow almost as before.
Because the equation v = ∇φ does not involve time derivatives, this is
one of the cases where it is legitimate to substitute a consequence of the
action principle back into the action. If we do this, we recover our previous
formulation.



1.6 Maximum or Minimum? 



We have provided many examples of stationary points in function space. We
have said almost nothing about whether these stationary points are maxima
or minima. There is a reason for this: investigating the character of the
stationary point requires the computation of the second functional derivative

    \frac{\delta^2 J}{\delta y(x_1)\,\delta y(x_2)}

and the use of the functional version of Taylor's theorem to expand about
the stationary point y(x):

    J[y + \varepsilon\eta] = J[y] + \varepsilon\int \eta(x)\left.\frac{\delta J}{\delta y(x)}\right|_y dx
        + \frac{\varepsilon^2}{2}\int \eta(x_1)\eta(x_2)\left.\frac{\delta^2 J}{\delta y(x_1)\,\delta y(x_2)}\right|_y dx_1\,dx_2 + \cdots. \qquad (1.178)

Since y(x) is a stationary point, the term with δJ/δy(x)|_y vanishes. Whether
y(x) is a maximum, a minimum, or a saddle therefore depends on the number
of positive and negative eigenvalues of δ²J/δ(y(x₁))δ(y(x₂)), a matrix with
a continuous infinity of rows and columns, these being labeled by x₁ and
x₂ respectively. It is not easy to diagonalize a continuously infinite matrix!
Consider, for example, the functional

    J[y] = \tfrac{1}{2}\int_a^b \left\{p(x)(y')^2 + q(x)y^2\right\}dx, \qquad (1.179)

with y(a) = y(b) = 0. Here, as we already know,

    \frac{\delta J}{\delta y(x)} = Ly \equiv -\frac{d}{dx}\left(p(x)\frac{d}{dx}y(x)\right) + q(x)y(x), \qquad (1.180)

and, except in special cases, this will be zero only if y(x) ≡ 0. We might
reasonably expect the second derivative to be

    \frac{\delta}{\delta y}(Ly) = L, \qquad (1.181)

where L is the Sturm-Liouville differential operator

    L = -\frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x). \qquad (1.182)

How can a differential operator be a matrix like δ²J/δ(y(x₁))δ(y(x₂))?

We can formally compute the second derivative by exploiting the Dirac
delta "function" δ(x) which has the property that

    y(x_2) = \int \delta(x_2 - x_1)\,y(x_1)\,dx_1. \qquad (1.183)

Thus

    \delta y(x_2) = \int \delta(x_2 - x_1)\,\delta y(x_1)\,dx_1, \qquad (1.184)

from which we read off that

    \frac{\delta y(x_2)}{\delta y(x_1)} = \delta(x_2 - x_1). \qquad (1.185)

Using (1.185), we find that

    \frac{\delta}{\delta y(x_1)}\left(\frac{\delta J}{\delta y(x_2)}\right) = -\frac{d}{dx_2}\left(p(x_2)\frac{d}{dx_2}\delta(x_2 - x_1)\right) + q(x_2)\,\delta(x_2 - x_1). \qquad (1.186)

How are we to make sense of this expression? We begin in the next chapter
where we explain what it means to differentiate δ(x), and show that (1.186)
does indeed correspond to the differential operator L. In subsequent
chapters we explore the manner in which differential operators and matrices are
related. We will learn that just as some matrices can be diagonalized so can
some differential operators, and that the class of diagonalizable operators
includes (1.182).

If all the eigenvalues of L are positive, our stationary point was a
minimum. For each negative eigenvalue, there is a direction in function space in
which J[y] decreases as we move away from the stationary point.
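
One way to make the "continuously infinite matrix" concrete is to discretize. The sketch below is an illustration under assumptions introduced here: J[y] of (1.179) is replaced by a sum over a uniform grid with y = 0 at both ends, its ordinary Hessian is computed by second differences (exact for a quadratic functional, up to rounding), and the result is compared with a tridiagonal finite-difference version of the operator (1.182). The coefficient functions p and q are arbitrary examples.

```python
def discrete_j(y, p, q, a, b):
    """J_h = (1/2) sum h [ p (dy/dx)^2 + q y^2 ], with y pinned to 0 at a and b."""
    n = len(y)
    h = (b - a) / (n + 1)
    full = [0.0] + list(y) + [0.0]
    total = 0.0
    for i in range(n + 1):                       # one term per grid segment
        x_mid = a + (i + 0.5) * h
        total += 0.5 * h * p(x_mid) * ((full[i + 1] - full[i]) / h) ** 2
    for i in range(1, n + 1):                    # one term per interior node
        total += 0.5 * h * q(a + i * h) * full[i] ** 2
    return total

def hessian(func, n, eps=1.0):
    """Second differences about y = 0."""
    def point(*indices):
        v = [0.0] * n
        for k in indices:
            v[k] += eps
        return v
    f0 = func(point())
    return [[(func(point(i, j)) - func(point(i)) - func(point(j)) + f0) / eps ** 2
             for j in range(n)] for i in range(n)]

def sturm_liouville_matrix(n, p, q, a, b):
    """Tridiagonal finite-difference form of L = -d/dx (p d/dx) + q."""
    h = (b - a) / (n + 1)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        x = a + (i + 1) * h
        mat[i][i] = (p(x - 0.5 * h) + p(x + 0.5 * h)) / h ** 2 + q(x)
        if i + 1 < n:
            mat[i][i + 1] = mat[i + 1][i] = -p(x + 0.5 * h) / h ** 2
    return mat

n, a, b = 5, 0.0, 1.0
h = (b - a) / (n + 1)
p = lambda x: 1.0 + x
q = lambda x: x * x
H = hessian(lambda y: discrete_j(y, p, q, a, b), n)
L_mat = sturm_liouville_matrix(n, p, q, a, b)
max_diff = max(abs(H[i][j] / h - L_mat[i][j]) for i in range(n) for j in range(n))
print("largest mismatch between Hessian / h and the operator matrix:", max_diff)
```

In this discretization the Hessian equals h times the operator matrix; the extra factor of h reflects the fact that a functional derivative is a derivative per unit length, so that the Kronecker delta divided by h stands in for δ(x₂ − x₁).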



1.7 Further Exercises and Problems 

Here is a collection of problems relating to the calculus of variations. Some 
date back to the 16th century, others are quite recent in origin. 

Exercise 1.1: A smooth path in the x-y plane is given by r(t) = (x(t), y(t))
with r(0) = a, and r(1) = b. The length of the path from a to b is therefore

    S[\mathbf{r}] = \int_0^1 \sqrt{\dot{x}^2 + \dot{y}^2}\,dt,

where ẋ ≡ dx/dt, ẏ ≡ dy/dt. Write down the Euler-Lagrange conditions for
S[r] to be stationary under small variations of the path that keep the endpoints
fixed, and hence show that the shortest path between two points is a straight
line.

Exercise 1.2: Fermat's principle. A medium is characterised optically by
its refractive index n, such that the speed of light in the medium is c/n.
According to Fermat (1657), the path taken by a ray of light between any
two points makes stationary the travel time between those points. Assume
that the ray propagates in the x, y plane in a layered medium with refractive
index n(x). Use Fermat's principle to establish Snell's law in its general form
n(x) sin ψ = constant by finding the equation giving the stationary paths y(x)
for

    F_1[y] = \int n(x)\sqrt{1 + y'^2}\,dx.

(Here the prime denotes differentiation with respect to x.) Repeat this exercise
for the case that n depends only on y and find a similar equation for the
stationary paths of

    F_2[y] = \int n(y)\sqrt{1 + y'^2}\,dx.

By using suitable definitions of the angle of incidence ψ in each case, show
that the two formulations of the problem give physically equivalent answers.
In the second formulation you will find it easiest to use the first integral of
Euler's equation.

Problem 1.3: Hyperbolic Geometry. This problem introduces a version of the
Poincaré model for the non-Euclidean geometry of Lobachevski.

a) Show that the stationary paths for the functional

    F[y] = \int_{x_1}^{x_2} \frac{1}{y}\sqrt{1 + y'^2}\,dx,

with y(x) restricted to lying in the upper half plane are semi-circles of
arbitrary radius and with centres on the x axis. These paths are the
geodesics, or minimum length paths, in a space with Riemann metric

    ds^2 = \frac{1}{y^2}(dx^2 + dy^2), \qquad y > 0.

b) Show that if we call these geodesics "lines", then one and only one line
can be drawn through two given points.

c) Two lines are said to be parallel if, and only if, they meet at "infinity",
i.e. on the x axis. (Verify that the x axis is indeed infinitely far from any
point with y > 0.) Show that, given a line q and a point A not lying on
that line, there are two lines passing through A that are parallel to
q, and that between these two lines lies a pencil of lines passing through
A that never meet q.

Problem 1.4: Elastic Rods. The elastic energy per unit length of a bent steel
rod is given by ½YI/R². Here R is the radius of curvature due to the bending,
Y is the Young's modulus of the steel and I = ∫∫ y² dxdy is the moment
of inertia of the rod's cross section about an axis through its centroid and
perpendicular to the plane in which the rod is bent. If the rod is only slightly
bent into the yz plane and lies close to the z axis, show that this elastic energy
can be approximated as

    U[y] = \int_0^L \tfrac{1}{2}YI\,(y'')^2\,dz,

where the prime denotes differentiation with respect to z and L is the length
of the rod. We will use this approximate energy functional to discuss two
practical problems.



Figure 1.14: A rod used as: a) a column, b) a cantilever.

a) Euler's problem: the buckling of a slender column. The rod is used as
a column which supports a compressive load Mg directed along the z
axis (which is vertical). Show that when the rod buckles slightly (i.e.
deforms with both ends remaining on the z axis) the total energy,
including the gravitational potential energy of the loading mass M, can be
approximated by

    U[y] = \int_0^L \left\{\frac{YI}{2}(y'')^2 - \frac{Mg}{2}(y')^2\right\}dz.

By considering small deformations of the form

    y(z) = \sum_{n=1}^{\infty} a_n\sin\frac{n\pi z}{L}

show that the column is unstable to buckling and collapse if
Mg ≥ π²YI/L².

b) Leonardo da Vinci's problem: the light cantilever. Here we take the z
axis as horizontal and the y axis as being vertical. The rod is used as
a beam or cantilever and is fixed into a wall so that y(0) = 0 = y'(0).
A weight Mg is hung from the end z = L and the beam sags in the −y
direction. We wish to find y(z) for 0 < z < L. We will ignore the weight
of the beam itself.

• Write down the complete expression for the energy, including the
gravitational potential energy of the weight.

• Find the differential equation and boundary conditions at z = 0, L
that arise from minimizing the total energy. In doing this take care
not to throw away any term arising from the integration by parts.
You may find the following identity to be of use:

    \frac{d}{dz}\left(f'g'' - fg'''\right) = f''g'' - fg''''.

• Solve the equation. You should find that the displacement of the
end of the beam is y(L) = −⅓MgL³/YI.

Exercise 1.5: Suppose that an elastic body Ω of density ρ is slightly deformed
so that the point that was at cartesian co-ordinate x_i is moved to x_i + η_i(x).
We define the resulting strain tensor e_ij by

    e_{ij} = \frac{1}{2}\left(\frac{\partial\eta_j}{\partial x_i} + \frac{\partial\eta_i}{\partial x_j}\right).

It is automatically symmetric in its indices. The Lagrangian for small-amplitude
elastic motion of the body is

    L[\eta] = \int_\Omega \left\{\tfrac{1}{2}\rho\dot{\eta}_i^2 - \tfrac{1}{2}e_{ij}c_{ijkl}e_{kl}\right\}d^3x.

Here, c_ijkl is the tensor of elastic constants, which has the symmetries

    c_{ijkl} = c_{klij} = c_{jikl} = c_{ijlk}.

By varying the η_i, show that the equation of motion for the body is

    \rho\frac{\partial^2\eta_i}{\partial t^2} - \frac{\partial}{\partial x_j}\sigma_{ji} = 0,

where

    \sigma_{ij} = c_{ijkl}e_{kl}

is the stress tensor. Show that variations of η_i on the boundary ∂Ω give as
boundary conditions

    \sigma_{ij}n_j = 0,

where n_i are the components of the outward normal on ∂Ω.

Problem 1.6: The catenary revisited. We can describe a catenary curve in
parametric form as x(s), y(s), where s is the arc-length. The potential
energy is then simply ∫ ρgy(s)ds where ρ is the mass per unit length of the
hanging chain. The x, y are not independent functions of s, however, because
ẋ² + ẏ² = 1 at every point on the curve. Here a dot denotes a derivative with
respect to s.

a) Introduce infinitely many Lagrange multipliers λ(s) to enforce the ẋ² + ẏ²
constraint, one for each point s on the curve. From the resulting
functional derive two coupled equations describing the catenary, one for x(s)
and one for y(s). By thinking about the forces acting on a small section
of the cable, and perhaps by introducing the angle ψ where ẋ = cos ψ and
ẏ = sin ψ, so that s and ψ are intrinsic coordinates for the curve,
interpret these equations and show that λ(s) is proportional to the
position-dependent tension T(s) in the chain.

b) You are provided with a light-weight line of length πa/2 and some lead
shot of total mass M. By using equations from the previous part (suitably
modified to take into account the position dependent ρ(s)) or otherwise,
determine how the lead should be distributed along the line if the
loaded line is to hang in an arc of a circle of radius a (see figure 1.15)
when its ends are attached to two points at the same height.

Figure 1.15: Weighted line.



Problem 1.7: Another model for Lobachevski geometry (see exercise 1.3) 
is the Poincare disc. This space consists of the interior of the unit disc 
D 2 = {(x,y) £ M 2 : x 2 + y 2 < 1} equipped with Riemann metric 
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The geodesic paths are found by minimizing the arc-length functional 




Figure 1.16: The Poincare disc of exercise 1.7. The radius OP of the Poincare 
disc is unity, while the radius of the geodesic arc PQR is PX = QX = RX = 
R. The distance between the centres of the disc and arc is OX = xq. Your 
task in part c) is to show that ZOPX = ZORX = 90°. 

a) Either by manipulating the two Euler-Lagrange equations that give the
conditions for $s[\mathbf{r}]$ to be stationary under variations in $\mathbf{r}(t)$, or, more
efficiently, by observing that $s[\mathbf{r}]$ is invariant under the infinitesimal rotation

$$ \delta x = \epsilon y, \qquad \delta y = -\epsilon x, $$

and applying Noether's theorem, show that the parameterised geodesics
obey

$$ \frac{d}{dt}\left( \frac{1}{1 - x^2 - y^2}\,\frac{x\dot y - y\dot x}{\sqrt{\dot x^2 + \dot y^2}} \right) = 0. $$

b) Given a point $(a, b)$ within $D^2$, and a direction through it, show that
the equation you derived in part a) determines a unique geodesic curve
passing through $(a, b)$ in the given direction, but does not determine the
parametrization of the curve.

c) Show that there exists a solution to the equation in part a) in the form

$$ x(t) = R\cos t + x_0, \qquad y(t) = R\sin t. $$

Find a relation between $x_0$ and $R$, and from it deduce that the geodesics
are circular arcs that cut the bounding unit circle (which plays the role
of the line at infinity in the Lobachevski plane) at right angles.

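Exercise 1.8: The Lagrangian for a particle of charge $q$ is

$$ L[\mathbf{x}, \dot{\mathbf{x}}] = \tfrac{1}{2} m \dot{\mathbf{x}}^2 - q\phi(\mathbf{x}) + q\,\dot{\mathbf{x}} \cdot \mathbf{A}(\mathbf{x}). $$

Show that Lagrange's equation leads to

$$ m\ddot{\mathbf{x}} = q(\mathbf{E} + \dot{\mathbf{x}} \times \mathbf{B}), $$

where

$$ \mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \operatorname{curl}\mathbf{A}. $$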



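Exercise 1.9: Consider the action functional

$$ S[\boldsymbol{\omega}, \mathbf{p}, \mathbf{r}] = \int \left( \tfrac{1}{2} I_1\omega_1^2 + \tfrac{1}{2} I_2\omega_2^2 + \tfrac{1}{2} I_3\omega_3^2 + \mathbf{p}\cdot(\dot{\mathbf{r}} + \boldsymbol{\omega}\times\mathbf{r}) \right) dt, $$

where $\mathbf{r}$ and $\mathbf{p}$ are time-dependent three-vectors, as is
$\boldsymbol{\omega} = (\omega_1, \omega_2, \omega_3)$. Apply
the action principle to obtain the equations of motion for $\mathbf{r}$, $\mathbf{p}$, $\boldsymbol{\omega}$ and show
that they lead to Euler's equations

$$ \begin{aligned}
I_1\dot\omega_1 - (I_2 - I_3)\,\omega_2\omega_3 &= 0, \\
I_2\dot\omega_2 - (I_3 - I_1)\,\omega_3\omega_1 &= 0, \\
I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2 &= 0,
\end{aligned} $$

governing the angular velocity of a freely-rotating rigid body.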
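
As a rough numerical illustration of what Euler's equations describe (it is not a solution of the exercise), the sketch below integrates them with a classical fourth-order Runge-Kutta step. The moments of inertia, the initial angular velocity and the step size are arbitrary illustrative choices, not values from the text. The kinetic energy and the squared angular momentum should stay constant up to integration error.

```python
# Illustrative principal moments of inertia (arbitrary values, not from the text).
I1, I2, I3 = 1.0, 2.0, 3.0

def euler_rhs(w):
    """Euler's equations solved for the time derivatives of (w1, w2, w3)."""
    w1, w2, w3 = w
    return (
        (I2 - I3) * w2 * w3 / I1,
        (I3 - I1) * w3 * w1 / I2,
        (I1 - I2) * w1 * w2 / I3,
    )

def rk4_step(w, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(base, k, h):
        return tuple(b + h * ki for b, ki in zip(base, k))
    k1 = euler_rhs(w)
    k2 = euler_rhs(shifted(w, k1, dt / 2))
    k3 = euler_rhs(shifted(w, k2, dt / 2))
    k4 = euler_rhs(shifted(w, k3, dt))
    return tuple(
        wi + dt * (a + 2 * b + 2 * c + d) / 6
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

def energy(w):
    """Rotational kinetic energy."""
    return 0.5 * (I1 * w[0] ** 2 + I2 * w[1] ** 2 + I3 * w[2] ** 2)

def ang_mom_sq(w):
    """Squared magnitude of the angular momentum in the body frame."""
    return (I1 * w[0]) ** 2 + (I2 * w[1]) ** 2 + (I3 * w[2]) ** 2

w0 = (1.0, 0.1, 0.5)          # arbitrary initial angular velocity
w = w0
for _ in range(10000):        # integrate to t = 10 with dt = 1e-3
    w = rk4_step(w, 1e-3)

energy_drift = abs(energy(w) - energy(w0))
ang_mom_drift = abs(ang_mom_sq(w) - ang_mom_sq(w0))
print(energy_drift, ang_mom_drift)
```

Both drifts come out tiny, which is a useful sanity check on the signs in the three equations: a sign error in any one of them generally destroys at least one of the two conservation laws.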

Problem 1.10: Piano String. An elastic piano string can vibrate both
transversely and longitudinally, and the two vibrations influence one another. A
Lagrangian that takes into account the lowest-order effect of stretching on the
local string tension, and can therefore model this coupled motion, is

$$ L[\xi, \eta] = \int dx \left\{ \frac{\rho_0}{2}\left[ \left(\frac{\partial\xi}{\partial t}\right)^2 + \left(\frac{\partial\eta}{\partial t}\right)^2 \right] - \frac{\lambda}{2}\left[ \frac{\tau_0}{\lambda} + \frac{\partial\xi}{\partial x} + \frac{1}{2}\left(\frac{\partial\eta}{\partial x}\right)^2 \right]^2 \right\}. $$

Figure 1.17: Vibrating piano string.

Here $\xi(x,t)$ is the longitudinal displacement and $\eta(x,t)$ the transverse
displacement of the string. Thus, the point that in the undisturbed string had
co-ordinates $[x, 0]$ is moved to the point with co-ordinates $[x + \xi(x,t), \eta(x,t)]$.
The parameter $\tau_0$ represents the tension in the undisturbed string, $\lambda$ is the
product of Young's modulus and the cross-sectional area of the string, and $\rho_0$
is the mass per unit length.

a) Use the action principle to derive the two coupled equations of motion,
one involving $\partial^2\xi/\partial t^2$ and one involving $\partial^2\eta/\partial t^2$.

b) Show that when we linearize these two equations of motion, the
longitudinal and transverse motions decouple. Find expressions for the
longitudinal ($c_L$) and transverse ($c_T$) wave velocities in terms of $\tau_0$, $\rho_0$ and
$\lambda$.

c) Assume that a given transverse pulse $\eta(x,t) = \eta_0(x - c_T t)$ propagates
along the string. Show that this induces a concurrent longitudinal pulse
of the form $\xi(x - c_T t)$. Show further that the longitudinal Newtonian
momentum density in this concurrent pulse is given by

$$ \rho_0 \frac{\partial\xi}{\partial t} = \frac{1}{2}\,\frac{c_L^2}{c_L^2 - c_T^2}\, T^0{}_x, $$

where

$$ T^0{}_x = -\rho_0 \frac{\partial\eta}{\partial x}\frac{\partial\eta}{\partial t} $$

is the associated pseudo-momentum density.

The forces that created the transverse pulse will also have created other
longitudinal waves that travel at $c_L$. Consequently the Newtonian $x$-momentum
moving at $c_T$ is not the only $x$-momentum on the string, and the total "true"
longitudinal momentum density is not simply proportional to the
pseudo-momentum density.

Exercise 1.11: Obtain the canonical energy-momentum tensor $T^\nu{}_\mu$ for the
barotropic fluid described by (1.119). Show that its conservation leads to both
the momentum conservation equation (1.128), and to the energy conservation
equation

$$ \partial_t \mathcal{E} + \partial_i\{v_i(\mathcal{E} + P)\} = 0, $$

where the energy density is

$$ \mathcal{E} = \tfrac{1}{2}\rho v^2 + u(\rho). $$

Interpret the energy flux as being the sum of the convective transport of energy
together with the rate of working by an element of fluid on its neighbours.

Problem 1.12: Consider the action functional$^3$

$$ S[\mathbf{v}, \rho, \phi, \beta, \gamma] = \int d^4x \left\{ -\tfrac{1}{2}\rho \mathbf{v}^2 - \phi\left( \frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) \right) + \rho\beta\left( \frac{\partial\gamma}{\partial t} + (\mathbf{v}\cdot\nabla)\gamma \right) + u(\rho) \right\}, $$

which is a generalization of (1.177) to include two new scalar fields $\beta$ and $\gamma$.
Show that varying $\mathbf{v}$ leads to

$$ \mathbf{v} = \nabla\phi + \beta\nabla\gamma. $$

This is the Clebsch representation of the velocity field. It allows for flows with
non-zero vorticity

$$ \boldsymbol{\omega} \equiv \operatorname{curl}\mathbf{v} = \nabla\beta \times \nabla\gamma. $$

Show that the equations that arise from varying the remaining fields $\rho$, $\phi$, $\beta$,
$\gamma$, together imply the mass conservation equation

$$ \frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0, $$

and Bernoulli's equation in the form

$$ \frac{\partial\mathbf{v}}{\partial t} + \boldsymbol{\omega}\times\mathbf{v} = -\nabla\left( \tfrac{1}{2}\mathbf{v}^2 + h \right). $$

(Recall that $h = du/d\rho$.) Show that this form of Bernoulli's equation is
equivalent to Euler's equation

$$ \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\nabla h. $$

Consequently $S$ provides an action principle for a general inviscid barotropic
flow.

3 H. Bateman, Proc. Roy. Soc. Lond. A 125 (1929) 598-618; C. C. Lin, Liquid Helium
in Proc. Int. Sch. Phys. "Enrico Fermi", Course XXI (Academic Press 1965).

Exercise 1.13: Drums and Membranes. The shape of a distorted drumskin is
described by the function $h(x,y)$, which gives the height to which the point
$(x, y)$ of the flat undistorted drumskin is displaced.

a) Show that the area of the distorted drumskin is equal to

$$ \mathrm{Area}[h] = \int dx\,dy\,\sqrt{1 + \left(\frac{\partial h}{\partial x}\right)^2 + \left(\frac{\partial h}{\partial y}\right)^2}, $$

where the integral is taken over the area of the flat drumskin.

b) Show that for small distortions, the area reduces to

$$ A[h] = \mathrm{const.} + \frac{1}{2}\int dx\,dy\,|\nabla h|^2. $$

c) Show that if $h$ satisfies the two-dimensional Laplace equation then $A$ is
stationary with respect to variations that vanish at the boundary.

d) Suppose the drumskin has mass $\rho_0$ per unit area, and surface tension $T$.
Write down the Lagrangian controlling the motion of the drumskin, and
derive the equation of motion that follows from it.

Problem 1.14: The Wulff construction. The surface-area functional of the
previous exercise can be generalized so as to find the equilibrium shape of a
crystal. We describe the crystal surface by giving its height $z(x, y)$ above the
x-y plane, and introduce the direction-dependent surface tension (the surface
free-energy per unit area) $\alpha(p,q)$, where

$$ p = \frac{\partial z}{\partial x}, \qquad q = \frac{\partial z}{\partial y}. \tag{*} $$

We seek to minimize the total surface free energy

$$ F[z] = \int dx\,dy\,\left\{ \alpha(p,q)\sqrt{1 + p^2 + q^2} \right\}, $$

subject to the constraint that the volume of the crystal

$$ V[z] = \int z\,dx\,dy $$

remains constant.

a) Enforce the volume constraint by introducing a Lagrange multiplier $2\lambda^{-1}$
and so obtain the Euler-Lagrange equation

$$ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial p}\right) + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial q}\right) = 2\lambda^{-1}. $$

Here

$$ f(p,q) = \alpha(p,q)\sqrt{1 + p^2 + q^2}. $$



b) Show in the isotropic case, where $\alpha$ is constant, that

$$ z(x,y) = \sqrt{(\alpha\lambda)^2 - (x - a)^2 - (y - b)^2} + \mathrm{const.} $$

is a solution of the Euler-Lagrange equation. In this case, therefore, the
equilibrium shape is a sphere.

An obvious way to satisfy the Euler-Lagrange equation in the general anisotropic
case would be to arrange things so that

$$ x = \lambda\frac{\partial f}{\partial p}, \qquad y = \lambda\frac{\partial f}{\partial q}. \tag{**} $$

c) Show that (**) is exactly the relationship we would have if $z(x, y)$ and
$\lambda f(p,q)$ were Legendre transforms of each other, i.e. if

$$ \lambda f(p,q) = px + qy - z(x,y), $$

where the $x$ and $y$ on the right-hand side are functions of $p$, $q$ obtained
by solving (*). Do this by showing that the inverse relation is

$$ z(x, y) = px + qy - \lambda f(p, q), $$

where now the $p$, $q$ on the right-hand side become functions of $x$ and $y$,
and are obtained by solving (**).

For real crystals, $\alpha(p,q)$ can have the property of being a
continuous-but-nowhere-differentiable function, and so the differential calculus
used in deriving the Euler-Lagrange equation is inapplicable. The Legendre
transformation, however, has a geometric interpretation that is more robust
than its calculus-based derivation.

Recall that if we have a two-parameter family of surfaces in $\mathbb{R}^3$ given by
$F(x,y,z;p,q) = 0$, then the equation of the envelope of the surfaces is found
by solving the equations

$$ 0 = F = \frac{\partial F}{\partial p} = \frac{\partial F}{\partial q} $$

so as to eliminate the parameters $p$, $q$.

d) Show that the equation

$$ F(x,y,z;p,q) = px + qy - z - \lambda\alpha(p,q)\sqrt{1 + p^2 + q^2} = 0 $$

describes a family of planes perpendicular to the unit vectors

$$ \mathbf{n} = \frac{(p, q, -1)}{\sqrt{1 + p^2 + q^2}} $$

and at a distance $\lambda\alpha(p,q)$ away from the origin.

e) Show that the equations to be solved for the envelope of this family of
planes are exactly those that determine $z(x,y)$. Deduce that, for smooth
$\alpha(p,q)$, the profile $z(x,y)$ is this envelope.




Figure 1.18: Two-dimensional Wulff crystal. a) Polar plot of surface tension
$\alpha$ as a function of the normal $\mathbf{n}$ to a crystal face, together with a line
perpendicular to $\mathbf{n}$ at distance $\alpha$ from the origin. b) Wulff's construction of the
corresponding crystal surface as the envelope of the family of perpendicular
lines. In this case, the minimum-energy crystal has curved faces, but sharp
corners. The envelope continues beyond the corners, but these parts are
unphysical.

Wulff conjectured$^4$ that, even for non-smooth $\alpha(p,q)$, the minimum-energy
shape is given by an equivalent geometric construction: erect the planes from
part d) and, for each plane, discard the half-space of $\mathbb{R}^3$ that lies on the far side
of the plane from the origin. The convex region consisting of the intersection
of the retained half-spaces is the crystal. When $\alpha(p,q)$ is smooth this "Wulff
body" is bounded by part of the envelope of the planes. (The parts of the
envelope not bounding the convex body, the "swallowtails" visible in figure
1.18, are unphysical.) When $\alpha(p,q)$ has cusps, these singularities can give
rise to flat facets which are often joined by rounded edges. A proof of Wulff's
claim had to wait 43 years, until 1944, when it was established by use of
the Brunn-Minkowski inequality.$^5$

4 G. Wulff, "Zur Frage der Geschwindigkeit des Wachstums und der Auflösung der
Kristallflächen," Zeitschrift für Kristallografie, 34 (1901) 449-530.

5 A. Dinghas, "Über einen geometrischen Satz von Wulff für die Gleichgewichtsform
von Kristallen," Zeitschrift für Kristallografie, 105 (1944) 304-314. For a readable modern
account see: R. Gardner, "The Brunn-Minkowski inequality," Bulletin Amer. Math. Soc.
39 (2002) 355-405.
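
Wulff's half-space construction is easy to try numerically in the two-dimensional setting of figure 1.18. The sketch below is only an illustration under stated assumptions: the scale factor $\lambda$ is set to one, the normal directions are sampled on a finite grid of angles, and the two tension functions (`isotropic` and `cusped`) are hypothetical examples, not taken from the text. A point is kept if it lies on the origin side of every line perpendicular to $\mathbf{n}(\theta)$ at distance $\alpha(\theta)$.

```python
import math

def in_wulff_body(x, y, alpha, n_angles=720):
    """True if (x, y) lies in every half-plane x.n(theta) <= alpha(theta)."""
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        nx, ny = math.cos(theta), math.sin(theta)
        # Small tolerance so that points on a facet are not lost to rounding.
        if x * nx + y * ny > alpha(theta) + 1e-12:
            return False
    return True

def isotropic(theta):
    """Constant surface tension: the retained region is the unit disc."""
    return 1.0

def cusped(theta):
    """Hypothetical tension with cusps at theta = 0, pi/2, pi, 3pi/2."""
    return abs(math.cos(theta)) + abs(math.sin(theta))

inside_disc = in_wulff_body(0.7, 0.7, isotropic)
outside_disc = in_wulff_body(0.8, 0.8, isotropic)
inside_square = in_wulff_body(0.9, 0.9, cusped)
outside_square = in_wulff_body(1.1, 0.0, cusped)
print(inside_disc, outside_disc, inside_square, outside_square)
```

For the cusped example the retained region is the square $|x| \le 1$, $|y| \le 1$, whose flat facets face the directions in which $\alpha$ has its cusps.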



Chapter 2 
Function Spaces 



Many differential equations of physics are relations involving linear differ- 
ential operators. These operators, like matrices, are linear maps acting on 
vector spaces. The new feature is that the elements of the vector spaces are 
functions, and the spaces are infinite dimensional. We can try to survive 
in these vast regions by relying on our experience in finite dimensions, but 
sometimes this fails, and more sophistication is required. 



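2.1 Motivation

In the previous chapter we considered two variational problems:

1) Find the stationary points of

$$ F(\mathbf{x}) = \tfrac{1}{2}\,\mathbf{x}\cdot\mathbf{A}\mathbf{x} = \tfrac{1}{2}\,x_i A_{ij} x_j \tag{2.1} $$

on the surface $\mathbf{x}\cdot\mathbf{x} = 1$. This led to the matrix eigenvalue equation

$$ \mathbf{A}\mathbf{x} = \lambda\mathbf{x}. \tag{2.2} $$

2) Find the stationary points of

$$ J[y] = \int_a^b \tfrac{1}{2}\left\{ p(x)(y')^2 + q(x)y^2 \right\} dx, \tag{2.3} $$

subject to the conditions $y(a) = y(b) = 0$ and

$$ K[y] = \int_a^b y^2\,dx = 1. \tag{2.4} $$

This led to the differential equation

$$ -(py')' + qy = \lambda y, \qquad y(a) = y(b) = 0. \tag{2.5} $$

There will be a solution that satisfies the boundary conditions only for
a discrete set of values of $\lambda$.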
The stationary points of both function and functional are therefore deter- 
mined by linear eigenvalue problems. The only difference is that the finite 
matrix in the first is replaced in the second by a linear differential operator. 
The theme of the next few chapters is an exploration of the similarities and 
differences between finite matrices and linear differential operators. In this 
chapter we will focus on how the functions on which the derivatives act can 
be thought of as vectors. 



2.1.1 Functions as vectors 

Consider $F[a, b]$, the set of all real (or complex) valued functions $f(x)$ on the
interval $[a, b]$. This is a vector space over the field of the real (or complex)
numbers: Given two functions $f_1(x)$ and $f_2(x)$, and two numbers $\lambda_1$ and $\lambda_2$,
we can form the sum $\lambda_1 f_1(x) + \lambda_2 f_2(x)$ and the result is still a function on the
same interval. Examination of the axioms listed in appendix A will show that
$F[a,b]$ possesses all the other attributes of a vector space as well. We may
think of the array of numbers $(f(x))$ for $x \in [a, b]$ as being the components
of the vector. Since there is an infinity of independent components, one
for each point $x$, the space of functions is infinite dimensional.

The set of all functions is usually too large for us. We will restrict
ourselves to subspaces of functions with nice properties, such as being continuous
or differentiable. There is some fairly standard notation for these spaces: The
space of $C^n$ functions (those which have $n$ continuous derivatives) is called
$C^n[a, b]$. For smooth functions (those with derivatives of all orders) we write
$C^\infty[a,b]$. For the space of analytic functions (those whose Taylor expansion
actually converges to the function) we write $C^\omega[a, b]$. For $C^\infty$ functions
defined on the whole real line we write $C^\infty(\mathbb{R})$. For the subset of functions
with compact support (those that vanish outside some finite interval) we
write $C_0^\infty(\mathbb{R})$. There are no non-zero analytic functions with compact
support: $C_0^\omega(\mathbb{R}) = \{0\}$.

2.2 Norms and Inner Products

We are often interested in "how large" a function is. This leads to the idea of
normed function spaces. There are many measures of function size. Suppose
$R(t)$ is the number of inches per hour of rainfall. If you are a farmer you
are probably most concerned with the total amount of rain that falls. A big
rain has big $\int |R(t)|\,dt$. If you are the Urbana city engineer worrying about
the capacity of the sewer system to cope with a downpour, you are primarily
concerned with the maximum value of $R(t)$. For you a big rain has a big
"$\sup |R(t)|$."$^1$

2.2.1 Norms and convergence 

We can seldom write down an exact solution function to a real-world problem.
We are usually forced to use numerical methods, or to expand as a power
series in some small parameter. The result is a sequence of approximate
solutions $f_n(x)$, which we hope will converge to the desired exact solution
$f(x)$ as we make the numerical grid smaller, or take more terms in the power
series.

Because there is more than one way to measure the "size" of a function,
the convergence of a sequence of functions $f_n$ to a limit function $f$ is not as
simple a concept as the convergence of a sequence of numbers $x_n$ to a limit $x$.
Convergence means that the distance between the $f_n$ and the limit function
$f$ gets smaller and smaller as $n$ increases, so each different measure of this
distance provides a new notion of what it means to converge. We are not
going to make much use of formal "$\varepsilon$, $\delta$" analysis, but you must realize that this
distinction between different forms of convergence is not merely academic:
real-world engineers must be precise about the kind of errors they are
prepared to tolerate, or else a bridge they design might collapse. Graduate-level
engineering courses in mathematical methods therefore devote much time to
these issues. While physicists do not normally face the same legal liabilities
as engineers, we should at least have it clear in our own minds what we mean
when we write that $f_n \to f$.

1 Here "sup," short for supremum, is synonymous with the "least upper bound" of a
set of numbers, i.e. the smallest number that is exceeded by no number in the set. This
concept is more useful than "maximum" because the supremum need not be an element
of the set. It is an axiom of the real number system that any bounded set of real numbers
has a least upper bound. The "greatest lower bound" is denoted "inf", for infimum.

Here are some common forms of convergence:

i) If, for each $x$ in its domain of definition $\mathcal{D}$, the set of numbers $f_n(x)$
converges to $f(x)$, then we say the sequence converges pointwise.

ii) If the maximum separation

$$ \sup_{x\in\mathcal{D}} |f_n(x) - f(x)| \tag{2.6} $$

goes to zero as $n \to \infty$, then we say that $f_n$ converges to $f$ uniformly
on $\mathcal{D}$.

iii) If

$$ \int_{\mathcal{D}} |f_n(x) - f(x)|\,dx \tag{2.7} $$

goes to zero as $n \to \infty$, then we say that $f_n$ converges in the mean to
$f$ on $\mathcal{D}$.

Uniform convergence implies pointwise convergence, but not vice versa. If
$\mathcal{D}$ is a finite interval, then uniform convergence implies convergence in the
mean, but convergence in the mean implies neither uniform nor pointwise
convergence.

Example: Consider the sequence $f_n = x^n$ ($n = 1, 2, \ldots$) and $\mathcal{D} = [0, 1)$.
Here, the round and square bracket notation means that the point $x = 0$ is
included in the interval, but the point 1 is excluded.

Figure 2.1: $x^n \to 0$ on $[0, 1)$, but not uniformly.

As $n$ becomes large we have $x^n \to 0$ pointwise in $\mathcal{D}$, but the convergence is
not uniform because

$$ \sup_{x\in\mathcal{D}} |x^n - 0| = 1 \tag{2.8} $$

for all $n$.

Example: Let $f_n = x^n$ with $\mathcal{D} = [0, 1]$. Now the two square brackets
mean that both $x = 0$ and $x = 1$ are to be included in the interval. In this
case we have neither uniform nor pointwise convergence of the $x^n$ to zero,
but $x^n \to 0$ in the mean.
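
The contrast between the two measures of distance in these examples can be seen with a few lines of arithmetic. The sketch below is illustrative only: the supremum over $[0, 1)$ is approximated by a maximum over a uniform grid (an assumption that is reasonable for the moderate values of $n$ used here, but would mislead for very large $n$), while the mean distance uses the exact value $\int_0^1 x^n\,dx = 1/(n+1)$.

```python
def sup_distance(n, samples=100000):
    """Grid estimate of sup |x^n - 0| over [0, 1); the point x = 1 is excluded."""
    return max((k / samples) ** n for k in range(samples))

def mean_distance(n):
    """Exact value of the integral of |x^n - 0| over the unit interval."""
    return 1.0 / (n + 1)

# For each n: (estimated sup distance, mean distance).
results = {n: (sup_distance(n), mean_distance(n)) for n in (1, 5, 50)}
for n, (sup_d, mean_d) in results.items():
    print(n, sup_d, mean_d)
```

The first column stays close to one however large $n$ becomes, while the second shrinks like $1/n$: the sequence converges to zero in the mean but not uniformly.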

We can describe uniform convergence by means of a norm, a generalization
of the usual measure of the length of a vector. A norm, denoted by
$\|f\|$, of a vector $f$ (a function, in our case) is a real number that obeys

i) positivity: $\|f\| \ge 0$, and $\|f\| = 0 \Leftrightarrow f = 0$,

ii) the triangle inequality: $\|f + g\| \le \|f\| + \|g\|$,

iii) linear homogeneity: $\|\lambda f\| = |\lambda|\,\|f\|$.

One example is the "sup" norm, which is defined by

$$ \|f\|_\infty = \sup_{x\in\mathcal{D}} |f(x)|. \tag{2.9} $$

This number is guaranteed to be finite if $f$ is continuous and $\mathcal{D}$ is compact.
In terms of the sup norm, uniform convergence is the statement that

$$ \lim_{n\to\infty} \|f_n - f\|_\infty = 0. \tag{2.10} $$



2.2.2 Norms from integrals 

The space $L^p[a, b]$, for any $1 \le p < \infty$, is defined to be our $F[a, b]$ equipped
with

$$ \|f\|_p = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p} \tag{2.11} $$

as the measure of length, and with a restriction to functions for which $\|f\|_p$
is finite.

We say that $f_n \to f$ in $L^p$ if the $L^p$ distance $\|f - f_n\|_p$ tends to zero. We
have already seen the $L^1$ measure of distance in the definition of convergence
in the mean. As in that case, convergence in $L^p$ says nothing about pointwise
convergence.
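
To get a feel for how the $L^p$ measure of length depends on $p$, the following sketch estimates $\|f\|_p$ with a simple midpoint rule. The helper `lp_norm`, the test function $f(x) = x$ on $[0, 1]$ and the number of cells are illustrative assumptions. For this function the exact value is $(p+1)^{-1/p}$, which creeps up towards the sup norm $\|f\|_\infty = 1$ as $p$ grows.

```python
def lp_norm(f, a, b, p, cells=20000):
    """Midpoint-rule estimate of (integral_a^b |f(x)|^p dx)^(1/p)."""
    h = (b - a) / cells
    total = sum(abs(f(a + (k + 0.5) * h)) ** p for k in range(cells))
    return (total * h) ** (1.0 / p)

def identity(x):
    """Test function f(x) = x, chosen only for illustration."""
    return x

# Estimated norms for a few values of p.
norms = {p: lp_norm(identity, 0.0, 1.0, p) for p in (1, 2, 10, 100)}
for p, value in norms.items():
    print(p, value, (p + 1) ** (-1.0 / p))
```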

We would like to regard $\|f\|_p$ as a norm. It is possible, however, for a
function to have $\|f\|_p = 0$ without $f$ being identically zero: a function
that vanishes at all but a finite set of points, for example. This pathology
violates number i) in our list of requirements for something to be called a
norm, but we circumvent the problem by simply declaring such functions to
be zero. This means that elements of the $L^p$ spaces are not really functions,
but only equivalence classes of functions, two functions being regarded as
the same if they differ by a function of zero length. Clearly these spaces are
not for use when anything significant depends on the value of the function
at any precise point. They are useful in physics, however, because we can
never measure a quantity at an exact position in space or time. We usually
measure some sort of local average.

The $L^p$ norms satisfy the triangle inequality for all $1 \le p \le \infty$, although
this is not exactly trivial to prove.

An important property for any space to have is that of being complete. 
Roughly speaking, a space is complete if when some sequence of elements of 
the space look as if they are converging, then they are indeed converging and 
their limit is an element of the space. To make this concept precise, we need 
to say what we mean by the phrase "look as if they are converging." This 
we do by introducing the idea of a Cauchy sequence. 

Definition: A sequence $f_n$ in a normed vector space is Cauchy if for any $\varepsilon > 0$
we can find an $N$ such that $n, m > N$ implies that $\|f_m - f_n\| < \varepsilon$.
This definition can be loosely paraphrased to say that the elements of a
Cauchy sequence get arbitrarily close to each other as $n \to \infty$.

A normed vector space is complete with respect to its norm if every
Cauchy sequence actually converges to some element in the space. Consider,
for example, the normed vector space $\mathbb{Q}$ of rational numbers with distance
measured in the usual way as $\|q_1 - q_2\| \equiv |q_1 - q_2|$. The sequence

$$ \begin{aligned}
q_0 &= 1.0, \\
q_1 &= 1.4, \\
q_2 &= 1.41, \\
q_3 &= 1.414, \\
&\;\;\vdots
\end{aligned} $$

consisting of successive decimal approximations to $\sqrt{2}$, obeys

$$ |q_n - q_m| < 10^{-\min(n,m)} \tag{2.12} $$

and so is Cauchy. Pythagoras famously showed that $\sqrt{2}$ is irrational, however,
and so this sequence of rational numbers has no limit in $\mathbb{Q}$. Thus $\mathbb{Q}$ is not
complete. The space $\mathbb{R}$ of real numbers is constructed by filling in the gaps
between the rationals, and so completing $\mathbb{Q}$. A real number such as $\sqrt{2}$
is defined as a Cauchy sequence of rational numbers (by giving a rule, for
example, that determines its infinite decimal expansion), with two rational
sequences $q_n$ and $q'_n$ defining the same real number if $q_n - q'_n$ converges to
zero.
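
The decimal approximations to $\sqrt{2}$ can be generated exactly with rational arithmetic, which makes the Cauchy bound (2.12) easy to check for the first few terms. This is a small sketch, not a proof: the helper `q` is a hypothetical name, and it builds each term by truncating $\sqrt{2}$ to $n$ decimal places using integer square roots.

```python
from fractions import Fraction
from math import isqrt

def q(n):
    """sqrt(2) truncated to n decimal places, as an exact rational number."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

terms = [q(n) for n in range(8)]
print([float(t) for t in terms[:4]])

# Check the bound |q_n - q_m| < 10^(-min(n, m)) for every pair of these terms.
cauchy_ok = all(
    abs(terms[n] - terms[m]) < Fraction(1, 10 ** min(n, m))
    for n in range(8)
    for m in range(8)
)

# None of these rationals squares to exactly 2.
never_exact = all(t * t != 2 for t in terms)
print(cauchy_ok, never_exact)
```

Every term is a perfectly good rational number and the terms crowd together as the bound requires, yet the number they are closing in on is not itself in $\mathbb{Q}$.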

A complete normed vector space is called a Banach space. If we interpret
the norms as Lebesgue integrals$^2$ then the $L^p[a, b]$ are complete, and therefore
Banach spaces. The theory of Lebesgue integration is rather complicated,
however, and is not really necessary. One way of avoiding it is explained in
exercise 2.2.

Exercise 2.1: Show that any convergent sequence is Cauchy.

2.2.3 Hilbert space 

The Banach space $L^2[a,b]$ is special in that it is also a Hilbert space. This
means that its norm is derived from an inner product. If we define the inner
product

$$ \langle f, g\rangle = \int_a^b f^* g\,dx \tag{2.13} $$

then the $L^2[a, b]$ norm can be written

$$ \|f\|_2 = \sqrt{\langle f, f\rangle}. \tag{2.14} $$

When we omit the subscript on a norm, we mean it to be this one. You
are probably familiar with this Hilbert space from your quantum mechanics
classes.

Being positive definite, the inner product satisfies the
Cauchy-Schwarz-Bunyakovsky inequality

$$ |\langle f, g\rangle| \le \|f\|\,\|g\|. \tag{2.15} $$

That this is so can be seen by observing that

$$ \langle \lambda f + \mu g, \lambda f + \mu g\rangle = (\lambda^*, \mu^*) \begin{pmatrix} \|f\|^2 & \langle f, g\rangle \\ \langle f, g\rangle^* & \|g\|^2 \end{pmatrix} \begin{pmatrix} \lambda \\ \mu \end{pmatrix} \tag{2.16} $$

must be non-negative for any choice of $\lambda$ and $\mu$. We therefore select $\lambda = \|g\|$,
$\mu = -\langle f, g\rangle^* \|g\|^{-1}$, in which case the non-negativity of (2.16) becomes the
statement that

$$ \|f\|^2\|g\|^2 - |\langle f, g\rangle|^2 \ge 0. \tag{2.17} $$

2 The "L" in $L^p$ honours Henri Lebesgue. Banach spaces are named after Stefan Banach,
who was one of the founders of functional analysis, a subject largely developed by him
and other habitués of the Scottish Café in Lvov, Poland.

From Cauchy-Schwarz-Bunyakovsky we can establish the triangle inequality:

$$ \begin{aligned}
\|f + g\|^2 &= \|f\|^2 + \|g\|^2 + 2\,\mathrm{Re}\,\langle f, g\rangle \\
&\le \|f\|^2 + \|g\|^2 + 2|\langle f, g\rangle| \\
&\le \|f\|^2 + \|g\|^2 + 2\|f\|\,\|g\| \\
&= (\|f\| + \|g\|)^2,
\end{aligned} \tag{2.18} $$

so

$$ \|f + g\| \le \|f\| + \|g\|. \tag{2.19} $$

A second important consequence of Cauchy-Schwarz-Bunyakovsky is that
if $f_n \to f$ in the sense that $\|f_n - f\| \to 0$, then

$$ |\langle f_n, g\rangle - \langle f, g\rangle| = |\langle (f_n - f), g\rangle| \le \|f_n - f\|\,\|g\| \tag{2.20} $$

tends to zero, and so

$$ \langle f_n, g\rangle \to \langle f, g\rangle. \tag{2.21} $$

This means that the inner product $\langle f, g\rangle$ is a continuous functional of $f$ and
$g$. Take care to note that this continuity hinges on $\|g\|$ being finite. It is for
this reason that we do not permit $\|g\| = \infty$ functions to be elements of our
Hilbert space.
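
Both inequalities are easy to spot-check numerically. The sketch below is a minimal illustration: the inner product on $[0, 1]$ is approximated by a midpoint rule, and the two complex-valued test functions `f` and `g` are arbitrary choices made for the example, not functions from the text.

```python
import cmath
import math

def inner(f, g, a=0.0, b=1.0, cells=2000):
    """Midpoint-rule estimate of <f, g> = integral of conj(f) * g over [a, b]."""
    h = (b - a) / cells
    total = 0j
    for k in range(cells):
        x = a + (k + 0.5) * h
        total += f(x).conjugate() * g(x)
    return h * total

def norm(f):
    """L^2 norm derived from the inner product."""
    return math.sqrt(inner(f, f).real)

def f(x):
    return cmath.exp(2j * math.pi * x) + x      # arbitrary test function

def g(x):
    return complex(x * x, 1.0 - x)              # arbitrary test function

def f_plus_g(x):
    return f(x) + g(x)

csb_lhs, csb_rhs = abs(inner(f, g)), norm(f) * norm(g)
tri_lhs, tri_rhs = norm(f_plus_g), norm(f) + norm(g)
print(csb_lhs, csb_rhs)   # |<f, g>| against ||f|| ||g||
print(tri_lhs, tri_rhs)   # ||f + g|| against ||f|| + ||g||
```

In each printed pair the first number should not exceed the second. Because the midpoint sum is itself a positive-definite inner product on the sampled values, the inequalities hold for the discretized quantities as well, apart from rounding.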



Orthonormal sets 

Once we are in possession of an inner product, we can introduce the notion
of an orthonormal set. A set of functions $\{u_n\}$ is orthonormal if

$$ \langle u_n, u_m\rangle = \delta_{nm}. \tag{2.22} $$

For example,

$$ 2\int_0^1 \sin(n\pi x)\sin(m\pi x)\,dx = \delta_{nm}, \qquad n, m = 1, 2, \ldots \tag{2.23} $$

so the set of functions $u_n = \sqrt{2}\sin n\pi x$ is orthonormal on $[0, 1]$. This set of
functions is also complete, in a different sense, however, from our earlier
use of this word. An orthonormal set of functions is said to be complete if any
function $f$ for which $\|f\|^2$ is finite, and hence $f$ an element of the Hilbert
space, has a convergent expansion

$$ f(x) = \sum_{n=0}^\infty a_n u_n(x). $$

If we assume that such an expansion exists, and that we can freely interchange
the order of the sum and integral, we can multiply both sides of this expansion
by $u_m^*(x)$, integrate over $x$, and use the orthonormality of the $u_n$'s to read
off the expansion coefficients as $a_n = \langle u_n, f\rangle$. When

$$ \|f\|^2 = \int_0^1 |f(x)|^2\,dx \tag{2.24} $$

and $u_n = \sqrt{2}\sin(n\pi x)$, the result is the half-range sine Fourier series.

Example: Expanding unity. Suppose $f(x) = 1$. Since $\int_0^1 |f|^2\,dx = 1$ is
finite, the function $f(x) = 1$ can be represented as a convergent sum of the
$u_n = \sqrt{2}\sin(n\pi x)$.

The inner product of $f$ with the $u_n$'s is

$$ \langle u_n, f\rangle = \int_0^1 \sqrt{2}\sin(n\pi x)\,dx = \begin{cases} 0, & n \text{ even}, \\ \dfrac{2\sqrt{2}}{n\pi}, & n \text{ odd}. \end{cases} $$

Thus,

$$ 1 = \sum_{n=0}^\infty \frac{4}{(2n+1)\pi}\sin\big((2n+1)\pi x\big), \qquad \text{in } L^2[0,1]. \tag{2.25} $$



It is important to understand that the sum converges to the left-hand side
in the closed interval $[0, 1]$ only in the $L^2$ sense. The series does not converge
pointwise to unity at $x = 0$ or $x = 1$: every term is zero at these points.

Figure 2.2: The sum of the first 31 terms in the sine expansion of $f(x) = 1$.

Figure 2.2 shows the sum of the series up to and including the term with
$n = 30$. The $L^2[0, 1]$ measure of the distance between $f(x) = 1$ and this sum
is

$$ \int_0^1 \left| 1 - \sum_{n=0}^{30} \frac{4}{(2n+1)\pi}\sin\big((2n+1)\pi x\big) \right|^2 dx = 0.00654. \tag{2.26} $$

We can make this number as small as we desire by taking sufficiently many
terms.

It is perhaps surprising that a set of functions that vanish at the
endpoints of the interval can be used to expand a function that does not vanish
at the ends. This exposes an important technical point: Any finite sum of
continuous functions vanishing at the endpoints is also a continuous function
vanishing at the endpoints. It is therefore tempting to talk about the
"subspace" of such functions. This set is indeed a vector space, and a subset of
the Hilbert space, but it is not itself a Hilbert space. As the example shows,
a Cauchy sequence of continuous functions vanishing at the endpoints of an
interval can converge to a continuous function that does not vanish there.
The "subspace" is therefore not complete in our original meaning of the term.
The set of continuous functions vanishing at the endpoints fits into the whole
Hilbert space much as the rational numbers fit into the real numbers: A
finite sum of rationals is a rational number, but an infinite sum of rationals
is not in general a rational number and we can obtain any real number as
the limit of a sequence of rational numbers. The rationals $\mathbb{Q}$ are therefore
a dense subset of the reals, and, as explained earlier, the reals are obtained
by completing the set of rationals by adding to this set its limit points. In
the same sense, the set of continuous functions vanishing at the endpoints is
a dense subset of the whole Hilbert space and the whole Hilbert space is its
completion.
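
The behaviour of the sine expansion of unity can be reproduced with a short calculation. The sketch below is illustrative: the function names and the midpoint-rule resolution are assumptions made for the example. It evaluates the 31-term partial sum at an endpoint and at the midpoint of the interval, and computes the squared $L^2$ distance from $f(x) = 1$ in two ways, by direct quadrature and from $1 - \sum |a_n|^2$, which follows from the orthonormality of the $u_n$.

```python
import math

def partial_sum(x, terms=31):
    """Sum of the first `terms` terms of the sine expansion of f(x) = 1."""
    return sum(
        4.0 / ((2 * n + 1) * math.pi) * math.sin((2 * n + 1) * math.pi * x)
        for n in range(terms)
    )

def l2_error_squared(terms=31, cells=20000):
    """Midpoint-rule estimate of the integral of |1 - partial_sum|^2 on [0, 1]."""
    h = 1.0 / cells
    return h * sum(
        (1.0 - partial_sum((k + 0.5) * h, terms)) ** 2 for k in range(cells)
    )

def l2_error_squared_from_coefficients(terms=31):
    """Same quantity computed as 1 - sum of |a_n|^2 over the retained terms."""
    return 1.0 - sum(
        8.0 / ((2 * n + 1) ** 2 * math.pi ** 2) for n in range(terms)
    )

at_endpoint = partial_sum(0.0)
at_midpoint = partial_sum(0.5)
err_quadrature = l2_error_squared()
err_coefficients = l2_error_squared_from_coefficients()
print(at_endpoint, at_midpoint)
print(err_quadrature, err_coefficients)
```

The partial sum is exactly zero at $x = 0$ however many terms are kept, while the integrated squared error is already small and keeps shrinking as `terms` is increased.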

Exercise 2.2: In this technical exercise we will explain in more detail how
we "complete" a Hilbert space. The idea is to mirror the construction of
the real numbers and define the elements of the Hilbert space to be Cauchy
sequences of continuous functions. To specify a general element of $L^2[a,b]$
we must therefore exhibit a Cauchy sequence $f_n \in C[a,b]$. The choice is not
unique: two Cauchy sequences $f_n^{(1)}(x)$ and $f_n^{(2)}(x)$ will specify the same
element if

$$ \lim_{n\to\infty} \|f_n^{(1)} - f_n^{(2)}\| = 0. $$

Such sequences are said to be equivalent. For convenience, we will write
"$\lim_{n\to\infty} f_n = f$" but bear in mind that, in this exercise, this means that
the sequence $f_n$ defines the symbol $f$, and not that $f$ is the limit of the
sequence, as this limit need have no prior existence. We have deliberately written
"$f$", and not "$f(x)$", for the "limit function" to warn us that $f$ is assigned no
unique numerical value at any $x$. A continuous function $f(x)$ can still be
considered to be an element of $L^2[a,b]$ (take a sequence in which every $f_n(x)$ is
equal to $f(x)$) but an equivalent sequence of $f_n(x)$ can alter the limiting $f(x)$
on a set of measure zero without changing the resulting element $f \in L^2[a, b]$.

i) If $f_n$ and $g_n$ are Cauchy sequences defining $f$, $g$, respectively, it is natural
to try to define the inner product $\langle f, g\rangle$ by setting

$$ \langle f, g\rangle \equiv \lim_{n\to\infty} \langle f_n, g_n\rangle. $$

Use the Cauchy-Schwarz-Bunyakovsky inequality to show that the
numbers $F_n = \langle f_n, g_n\rangle$ form a Cauchy sequence in $\mathbb{C}$. Since $\mathbb{C}$ is complete,
deduce that this limit exists. Next show that the limit is unaltered if
either $f_n$ or $g_n$ is replaced by an equivalent sequence. Conclude that our
tentative inner product is well defined.

ii) The next, and harder, task is to show that the "completed" space is
indeed complete. The problem is to show that a given Cauchy sequence
$f_k \in L^2[a, b]$, where the $f_k$ are not necessarily in $C[a,b]$, has a limit
in $L^2[a,b]$. Begin by taking Cauchy sequences $f_{k,i} \in C[a, b]$ such that
$\lim_{i\to\infty} f_{k,i} = f_k$. Use the triangle inequality to show that we can select
a subsequence $f_{k,i(k)}$ that is Cauchy and so defines the desired limit.

Later we will show that the elements of $L^2[a, b]$ can be given a concrete meaning
as distributions.

Best approximation

Let $u_n(x)$ be an orthonormal set of functions. The sum of the first $N$ terms of
the Fourier expansion of $f(x)$ in the $u_n$ is the closest, measuring distance
with the $L^2$ norm, that one can get to $f$ whilst remaining in the space
spanned by $u_1, u_2, \ldots, u_N$.

To see this, consider the square of the error-distance:

$$ \begin{aligned}
\Delta \stackrel{\text{def}}{=} \Big\|f - \sum_{n=1}^N a_n u_n\Big\|^2
&= \Big\langle f - \sum_{m=1}^N a_m u_m,\; f - \sum_{n=1}^N a_n u_n \Big\rangle \\
&= \|f\|^2 - \sum_{n=1}^N a_n\langle f, u_n\rangle - \sum_{m=1}^N a_m^*\langle u_m, f\rangle + \sum_{n,m=1}^N a_m^* a_n \langle u_m, u_n\rangle \\
&= \|f\|^2 - \sum_{n=1}^N a_n\langle f, u_n\rangle - \sum_{m=1}^N a_m^*\langle u_m, f\rangle + \sum_{n=1}^N |a_n|^2.
\end{aligned} \tag{2.27} $$

In the last line we have used the orthonormality of the $u_n$. We can complete
the squares, and rewrite $\Delta$ as

$$ \Delta = \|f\|^2 - \sum_{n=1}^N |\langle u_n, f\rangle|^2 + \sum_{n=1}^N |a_n - \langle u_n, f\rangle|^2. \tag{2.28} $$

We seek to minimize $\Delta$ by a suitable choice of coefficients $a_n$. The smallest
we can make it is

$$ \Delta_{\min} = \|f\|^2 - \sum_{n=1}^N |\langle u_n, f\rangle|^2, \tag{2.29} $$

and we attain this bound by setting each of the $|a_n - \langle u_n, f\rangle|$ equal to zero.
That is, by taking

$$ a_n = \langle u_n, f\rangle. \tag{2.30} $$

Thus the Fourier coefficients $\langle u_n, f\rangle$ are the optimal choice for the $a_n$.
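
The completed-square form (2.28) says that moving any single coefficient away from its Fourier value by $\delta$ raises $\Delta$ by exactly $|\delta|^2$. The sketch below checks this numerically for real-valued functions on $[0, 1]$ with the sine basis $u_n = \sqrt{2}\sin(n\pi x)$. The target function, the number of basis functions and the size of the shift are arbitrary illustrative choices.

```python
import math

def inner(f, g, cells=4000):
    """Midpoint-rule estimate of the real inner product <f, g> on [0, 1]."""
    h = 1.0 / cells
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(cells))

def u(n):
    """Orthonormal sine function u_n(x) = sqrt(2) sin(n pi x)."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def target(x):
    return x * (1.0 - x) + 0.3      # arbitrary test function

def error_squared(coeffs):
    """Delta = || f - sum_n a_n u_n ||^2 for coefficients a_1, ..., a_N."""
    def residual(x):
        return target(x) - sum(a * u(n + 1)(x) for n, a in enumerate(coeffs))
    return inner(residual, residual)

N = 4
fourier = [inner(u(n), target) for n in range(1, N + 1)]
delta_min = error_squared(fourier)

# Shift one coefficient by 0.1: Delta should grow by 0.1 ** 2 = 0.01.
shifted = list(fourier)
shifted[1] += 0.1
delta_shifted = error_squared(shifted)

print(delta_min, delta_shifted - delta_min)
```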

Suppose we have some non-orthogonal collection of functions $g_n$,
$n = 1, \ldots, N$, and we have found the best approximation $\sum_{n=1}^N a_n g_n(x)$ to $f(x)$.
Now suppose we are given a $g_{N+1}$ to add to our collection. We may then seek
an improved approximation $\sum_{n=1}^{N+1} a'_n g_n(x)$ by including this new function,
but finding this better fit will generally involve tweaking all the $a_n$, not just
trying different values of $a_{N+1}$. The great advantage of approximating by
orthogonal functions is that, given another member of an orthonormal family,
we can improve the precision of the fit by adjusting only the coefficient of the
new term. We do not have to perturb the previously obtained coefficients.

Parseval's theorem 

The "best approximation" result from the previous section allows us to give an alternative definition of a "complete orthonormal set," and to obtain the formula $a_n = \langle u_n, f\rangle$ for the expansion coefficients without having to assume that we can integrate the infinite series $\sum a_n u_n$ term-by-term. Recall that we said that a set of points $S$ is a dense subset of a space $T$ if any given point $x \in T$ is the limit of a sequence of points in $S$, i.e. there are elements of $S$ lying arbitrarily close to $x$. For example, the set of rational numbers $\mathbb{Q}$ is a dense subset of $\mathbb{R}$. Using this language, we say that a set of orthonormal functions $\{u_n(x)\}$ is complete if the set of all finite linear combinations of the $u_n$ is a dense subset of the entire Hilbert space. This guarantees that, by taking $N$ sufficiently large, our best approximation will approach arbitrarily close to our target function $f(x)$. Since the best approximation containing all the $u_n$ up to $u_N$ is the $N$-th partial sum of the Fourier series, this shows that the Fourier series actually converges to $f$.

We have therefore proved that if we are given $u_n(x)$, $n = 1, 2, \ldots$, a complete orthonormal set of functions on $[a,b]$, then any function for which $\|f\|^2$ is finite can be expanded as a convergent Fourier series
$$f(x) = \sum_{n=1}^{\infty} a_n u_n(x), \tag{2.31}$$
where
$$a_n = \langle u_n, f\rangle = \int_a^b u_n^*(x) f(x)\,dx. \tag{2.32}$$
The convergence is guaranteed only in the $L^2$ sense that
$$\lim_{N\to\infty} \int_a^b \left| f(x) - \sum_{n=1}^{N} a_n u_n(x) \right|^2 dx = 0. \tag{2.33}$$
Equivalently
$$\Delta_N = \left\| f - \sum_{n=1}^{N} a_n u_n \right\|^2 \to 0 \tag{2.34}$$
as $N \to \infty$. Now, we showed in the previous section that
$$\Delta_N = \|f\|^2 - \sum_{n=1}^{N} |\langle u_n, f\rangle|^2 = \|f\|^2 - \sum_{n=1}^{N} |a_n|^2, \tag{2.35}$$
and so the $L^2$ convergence is equivalent to the statement that
$$\|f\|^2 = \sum_{n=1}^{\infty} |a_n|^2. \tag{2.36}$$
This last result is called Parseval's theorem.

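Example: In the expansion (2.25), we have $\|f\|^2 = 1$ and
$$|a_n|^2 = \begin{cases} 8/(n^2\pi^2), & n \text{ odd},\\ 0, & n \text{ even}. \end{cases} \tag{2.37}$$
Parseval therefore tells us that
$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{8}. \tag{2.38}$$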
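
As a quick sanity check of (2.38), one can sum the series numerically. This is a minimal sketch; the function name and the cutoff of 100000 terms are arbitrary choices of ours.

```python
import math

def odd_inverse_squares(terms):
    """Partial sum of 1/(2n+1)^2 for n = 0 .. terms-1."""
    return sum(1.0 / (2 * n + 1) ** 2 for n in range(terms))

partial = odd_inverse_squares(100000)
target = math.pi ** 2 / 8

# Parseval's sum of |a_n|^2 = 8/(n^2 pi^2) over odd n should approach ||f||^2 = 1.
parseval_sum = 8.0 / math.pi ** 2 * partial
```

The neglected tail falls off only like $1/(4N)$, so the agreement improves slowly with the number of terms.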



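Example: The functions $u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx}$, $n \in \mathbb{Z}$, form a complete orthonormal set on the interval $[-\pi, \pi]$. Let $f(x) = \frac{1}{\sqrt{2\pi}} e^{i\zeta x}$. Then its Fourier expansion is
$$\frac{1}{\sqrt{2\pi}} e^{i\zeta x} = \sum_{n=-\infty}^{\infty} c_n \frac{1}{\sqrt{2\pi}} e^{inx}, \quad -\pi < x < \pi, \tag{2.39}$$
where
$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i\zeta x} e^{-inx}\,dx = \frac{\sin(\pi(\zeta - n))}{\pi(\zeta - n)}. \tag{2.40}$$
We also have that
$$\|f\|^2 = \int_{-\pi}^{\pi} \frac{1}{2\pi}\,dx = 1. \tag{2.41}$$
Now Parseval tells us that
$$\|f\|^2 = \sum_{n=-\infty}^{\infty} \frac{\sin^2(\pi(\zeta - n))}{\pi^2(\zeta - n)^2}, \tag{2.42}$$
the left hand side being unity.

Finally, as $\sin^2(\pi(\zeta - n)) = \sin^2(\pi\zeta)$, we have
$$\operatorname{cosec}^2(\pi\zeta) \equiv \frac{1}{\sin^2(\pi\zeta)} = \sum_{n=-\infty}^{\infty} \frac{1}{\pi^2(\zeta - n)^2}. \tag{2.43}$$
The end result is a quite non-trivial expansion for the square of the cosecant.
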
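
The expansion (2.43) is easy to test numerically. The sketch below truncates the sum symmetrically at $|n| \le 100000$ and uses the sample value $\zeta = 0.3$; both are illustrative choices of ours.

```python
import math

def cosec_sq_series(zeta, cutoff):
    """Symmetric partial sum of 1 / (pi^2 (zeta - n)^2) over |n| <= cutoff."""
    return sum(1.0 / (math.pi * (zeta - n)) ** 2
               for n in range(-cutoff, cutoff + 1))

zeta = 0.3
series = cosec_sq_series(zeta, 100000)
exact = 1.0 / math.sin(math.pi * zeta) ** 2
```
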
2.2.4 Orthogonal polynomials 

A useful class of orthonormal functions are the sets of orthogonal polynomials associated with an interval $[a,b]$ and a positive weight function $w(x)$ such that $\int_a^b w(x)\,dx$ is finite. We introduce the Hilbert space $L^2_w[a,b]$ with the real inner product
$$\langle u, v\rangle_w = \int_a^b w(x) u(x) v(x)\,dx, \tag{2.44}$$
and apply the Gram-Schmidt procedure to the monomial powers $1, x, x^2, x^3, \ldots$ so as to produce an orthonormal set. We begin with
$$P_0(x) = 1/\|1\|_w, \tag{2.45}$$
where $\|1\|_w = \sqrt{\int_a^b w(x)\,dx}$, and define recursively
$$P_{n+1}(x) = \frac{x P_n(x) - \sum_{j=0}^{n} P_j(x)\langle P_j, xP_n\rangle_w}{\left\| xP_n - \sum_{j=0}^{n} P_j \langle P_j, xP_n\rangle_w \right\|_w}. \tag{2.46}$$
Clearly $P_n(x)$ is an $n$-th order polynomial, and by construction
$$\langle P_n, P_m\rangle_w = \delta_{nm}. \tag{2.47}$$

All such sets of polynomials obey a three-term recurrence relation
$$xP_n(x) = b_n P_{n+1}(x) + a_n P_n(x) + b_{n-1} P_{n-1}(x). \tag{2.48}$$
That there are only three terms, and that the coefficients of $P_{n+1}$ and $P_{n-1}$ are related, is due to the identity
$$\langle P_n, xP_m\rangle_w = \langle xP_n, P_m\rangle_w. \tag{2.49}$$
This means that the matrix (in the $P_n$ basis) representing the operation of multiplication by $x$ is symmetric. Since multiplication by $x$ raises the degree by one, $xP_n$ contains no $P_m$ with $m > n+1$. The matrix therefore has just one non-zero entry above the main diagonal, and hence, by symmetry, only one below.
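
A small exact-arithmetic sketch of the Gram-Schmidt construction may help. We assume $w = 1$ on $[-1,1]$, and, to avoid square roots, we keep the polynomials monic rather than normalizing them as in (2.46); this is a deliberate simplification, and the helper names are ours. Polynomials are stored as coefficient lists, lowest power first.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q>_w for w = 1 on [-1, 1], using int x^k dx = 2/(k+1) for even k."""
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if (i + j) % 2 == 0:                 # odd powers integrate to zero
                total += Fraction(2, i + j + 1) * pi * qj
    return total

def times_x(p):
    """Coefficients of x * p(x)."""
    return [Fraction(0)] + list(p)

def gram_schmidt(count):
    """Return `count` mutually orthogonal monic polynomials."""
    polys = [[Fraction(1)]]
    while len(polys) < count:
        new = times_x(polys[-1])
        for q in polys:
            c = inner(q, new) / inner(q, q)      # projection coefficient
            padded = list(q) + [Fraction(0)] * (len(new) - len(q))
            new = [a - c * b for a, b in zip(new, padded)]
        polys.append(new)
    return polys

polys = gram_schmidt(4)   # 1, x, x^2 - 1/3, x^3 - 3x/5
```

The vanishing of `inner(polys[0], times_x(polys[3]))` is an instance of the statement that the matrix of multiplication by $x$ has no entries more than one step from the diagonal.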

The completeness of a family of polynomials orthogonal on a finite interval is guaranteed by the Weierstrass approximation theorem, which asserts that for any continuous real function $f(x)$ on $[a,b]$, and for any $\epsilon > 0$, there exists a polynomial $p(x)$ such that $|f(x) - p(x)| < \epsilon$ for all $x \in [a,b]$. This means that polynomials are dense in the space of continuous functions equipped with the $\|\ldots\|_\infty$ norm. Because $|f(x) - p(x)| < \epsilon$ implies that
$$\int_a^b |f(x) - p(x)|^2 w(x)\,dx \le \epsilon^2 \int_a^b w(x)\,dx, \tag{2.50}$$
they are also a dense subset of the continuous functions in the sense of $L^2_w[a,b]$ convergence. Because the Hilbert space $L^2_w[a,b]$ is defined to be the completion of the space of continuous functions, the continuous functions are automatically dense in $L^2_w[a,b]$. Now the triangle inequality tells us that a dense subset of a dense set is dense in the larger set, so the polynomials are dense in $L^2_w[a,b]$ itself. The normalized orthogonal polynomials therefore constitute a complete orthonormal set.

For later use, we here summarize the properties of the families of polynomials named after Legendre, Hermite and Tchebychef.



Legendre polynomials 

Legendre polynomials have $a = -1$, $b = 1$ and $w = 1$. The standard Legendre polynomials are not normalized by the scalar product, but instead by setting $P_n(1) = 1$. They are given by Rodrigues' formula
$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n. \tag{2.51}$$
The first few are
$$\begin{aligned}
P_0(x) &= 1,\\
P_1(x) &= x,\\
P_2(x) &= \tfrac{1}{2}(3x^2 - 1),\\
P_3(x) &= \tfrac{1}{2}(5x^3 - 3x),\\
P_4(x) &= \tfrac{1}{8}(35x^4 - 30x^2 + 3).
\end{aligned}$$
Their inner product is
$$\int_{-1}^{1} P_n(x) P_m(x)\,dx = \frac{2}{2n+1}\,\delta_{nm}. \tag{2.52}$$
The three-term recurrence relation is
$$(2n+1)\,x P_n(x) = (n+1) P_{n+1}(x) + n P_{n-1}(x). \tag{2.53}$$
The $P_n$ form a complete set for expanding functions on $[-1,1]$.
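
The recurrence (2.53) gives a convenient way to generate the $P_n$. The following sketch, with a function name and list representation of our own choosing (coefficients stored lowest power first), reproduces the table above in exact rational arithmetic.

```python
from fractions import Fraction

def legendre(n_max):
    """Coefficient lists of P_0 .. P_n_max, built from the recurrence (2.53)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]                    # x * P_n
        prev = polys[n - 1] + [Fraction(0), Fraction(0)]   # P_{n-1}, padded
        nxt = [((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)]
        polys.append(nxt)
    return polys[: n_max + 1]

table = legendre(4)
p4 = table[4]   # coefficients of (3 - 30 x^2 + 35 x^4) / 8
```

Because the coefficients of each polynomial sum to $P_n(1)$, summing a list is a quick check of the normalization $P_n(1) = 1$.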



Hermite polynomials 

The Hermite polynomials have $a = -\infty$, $b = +\infty$ and $w(x) = e^{-x^2}$, and are defined by the generating function
$$e^{2tx - t^2} = \sum_{n=0}^{\infty} \frac{1}{n!} H_n(x)\, t^n. \tag{2.54}$$
If we write
$$e^{2tx - t^2} = e^{x^2 - (x-t)^2}, \tag{2.55}$$
we may use Taylor's theorem to find
$$H_n(x) = \left. \frac{d^n}{dt^n} e^{x^2 - (x-t)^2} \right|_{t=0} = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}, \tag{2.56}$$
which is a useful alternative definition. The first few Hermite polynomials are
$$\begin{aligned}
H_0(x) &= 1,\\
H_1(x) &= 2x,\\
H_2(x) &= 4x^2 - 2,\\
H_3(x) &= 8x^3 - 12x,\\
H_4(x) &= 16x^4 - 48x^2 + 12.
\end{aligned}$$
The normalization is such that
$$\int_{-\infty}^{\infty} H_n(x) H_m(x) e^{-x^2}\,dx = 2^n n! \sqrt{\pi}\,\delta_{nm}, \tag{2.57}$$
as may be proved by using the generating function. The three-term recurrence relation is
$$2x H_n(x) = H_{n+1}(x) + 2n H_{n-1}(x). \tag{2.58}$$
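
The recurrence (2.58) and the generating function (2.54) can be checked against one another numerically. This is a sketch only: the sample point $x = 0.7$, $t = 0.3$ and the truncation at 40 terms are arbitrary choices of ours.

```python
import math

def hermite_values(n_max, x):
    """[H_0(x), ..., H_n_max(x)] from H_{n+1} = 2x H_n - 2n H_{n-1}."""
    values = [1.0, 2.0 * x]
    for n in range(1, n_max):
        values.append(2.0 * x * values[n] - 2.0 * n * values[n - 1])
    return values[: n_max + 1]

x, t = 0.7, 0.3
series = sum(h * t ** n / math.factorial(n)
             for n, h in enumerate(hermite_values(40, x)))
closed = math.exp(2.0 * t * x - t * t)   # left-hand side of (2.54)
```
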

Exercise 2.3: Evaluate the integral
$$\int_{-\infty}^{\infty} e^{-x^2} e^{2sx - s^2} e^{2tx - t^2}\,dx$$
and expand the result as a double power series in $s$ and $t$. By examining the coefficient of $s^n t^m$, show that
$$\int_{-\infty}^{\infty} H_n(x) H_m(x) e^{-x^2}\,dx = 2^n n! \sqrt{\pi}\,\delta_{nm}.$$

Problem 2.4: Let
$$\varphi_n(x) = \frac{1}{\sqrt{2^n n! \sqrt{\pi}}} H_n(x) e^{-x^2/2}$$
be the normalized Hermite functions. They form a complete orthonormal set in $L^2(\mathbb{R})$. Show that
$$\sum_{n=0}^{\infty} t^n \varphi_n(x) \varphi_n(y) = \frac{1}{\sqrt{\pi(1 - t^2)}} \exp\left\{ \frac{4xyt - (x^2 + y^2)(1 + t^2)}{2(1 - t^2)} \right\}, \quad 0 \le t < 1.$$
This is Mehler's formula. (Hint: Expand the right hand side as $\sum_{n=0}^{\infty} a_n(x,t) \varphi_n(y)$. To find $a_n(x,t)$, multiply by $e^{2sy - s^2 - y^2/2}$ and integrate over $y$.)

Exercise 2.5: Let $\varphi_n(x)$ be the same functions as in the preceding problem. Define a Fourier-transform operator $F : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ by
$$F(f) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ixs} f(s)\,ds.$$
With this normalization of the Fourier transform, $F^4$ is the identity map. The possible eigenvalues of $F$ are therefore $\pm 1$, $\pm i$. Starting from (2.56), show that the $\varphi_n(x)$ are eigenfunctions of $F$, and that
$$F(\varphi_n) = i^n \varphi_n(x).$$

Tchebychef polynomials

Tchebychef polynomials are defined by taking $a = -1$, $b = +1$ and $w(x) = (1 - x^2)^{\pm 1/2}$. The Tchebychef polynomials of the first kind are
$$T_n(x) = \cos(n \cos^{-1} x). \tag{2.59}$$
The first few are
$$\begin{aligned}
T_0(x) &= 1,\\
T_1(x) &= x,\\
T_2(x) &= 2x^2 - 1,\\
T_3(x) &= 4x^3 - 3x.
\end{aligned}$$
The Tchebychef polynomials of the second kind are
$$U_{n-1}(x) = \frac{\sin(n \cos^{-1} x)}{\sin(\cos^{-1} x)} = \frac{1}{n} T_n'(x), \tag{2.60}$$
and the first few are
$$\begin{aligned}
U_{-1}(x) &= 0,\\
U_0(x) &= 1,\\
U_1(x) &= 2x,\\
U_2(x) &= 4x^2 - 1,\\
U_3(x) &= 8x^3 - 4x.
\end{aligned}$$
$T_n$ and $U_n$ obey the same recurrence relation
$$\begin{aligned}
2x T_n &= T_{n+1} + T_{n-1},\\
2x U_n &= U_{n+1} + U_{n-1},
\end{aligned}$$
which are disguised forms of elementary trigonometric identities. The orthogonality is also a disguised form of the orthogonality of the functions $\cos n\theta$ and $\sin n\theta$. After setting $x = \cos\theta$ we have
$$\int_0^{\pi} \cos n\theta \cos m\theta\,d\theta = \int_{-1}^{1} \frac{1}{\sqrt{1 - x^2}} T_n(x) T_m(x)\,dx = h_n \delta_{nm}, \quad n, m \ge 0, \tag{2.61}$$
where $h_0 = \pi$, $h_n = \pi/2$, $n > 0$, and
$$\int_0^{\pi} \sin n\theta \sin m\theta\,d\theta = \int_{-1}^{1} \sqrt{1 - x^2}\, U_{n-1}(x) U_{m-1}(x)\,dx = \frac{\pi}{2} \delta_{nm}, \quad n, m > 0. \tag{2.62}$$
The set $\{T_n(x)\}$ is therefore orthogonal and complete in $L^2_{(1-x^2)^{-1/2}}[-1,1]$, and the set $\{U_n(x)\}$ is orthogonal and complete in $L^2_{(1-x^2)^{1/2}}[-1,1]$. Any function continuous on the closed interval $[-1,1]$ lies in both of these spaces, and can therefore be expanded in terms of either set.
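
That the recurrence relations really are trigonometric identities in disguise can be seen by iterating them at $x = \cos\theta$. The sketch below uses function names of our own and starts from $T_0, T_1$ and $U_{-1}, U_0$ as tabulated above.

```python
import math

def cheb_T(n, x):
    """T_n(x) from 2x T_n = T_{n+1} + T_{n-1}, starting with T_0 = 1, T_1 = x."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def cheb_U(n, x):
    """U_n(x) from the same recurrence, starting with U_{-1} = 0, U_0 = 1."""
    u_prev, u = 0.0, 1.0
    for _ in range(n):
        u_prev, u = u, 2.0 * x * u - u_prev
    return u

theta = 0.7
x = math.cos(theta)
t5 = cheb_T(5, x)                        # should equal cos(5 theta)
u4_sin = cheb_U(4, x) * math.sin(theta)  # should equal sin(5 theta)
```
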



2.3 Linear Operators and Distributions 

Our theme is the analogy between linear differential operators and matrices. 
It is therefore useful to understand how we can think of a differential operator 
as a continuously indexed "matrix." 



2.3.1 Linear operators 

The action of a matrix on a vector $y = Ax$ is given in components by
$$y_i = A_{ij} x_j. \tag{2.63}$$
The function-space analogue of this, $g = Af$, is naturally to be thought of as
$$g(x) = \int_a^b A(x,y) f(y)\,dy, \tag{2.64}$$
where the summation over adjacent indices has been replaced by an integration over the dummy variable $y$. If $A(x,y)$ is an ordinary function then $A(x,y)$ is called an integral kernel. We will study such linear operators in the chapter on integral equations.

The identity operation is
$$f(x) = \int_a^b \delta(x - y) f(y)\,dy, \tag{2.65}$$
and so the Dirac delta function, which is not an ordinary function, plays the role of the identity matrix. Once we admit distributions such as $\delta(x)$, we can think of differential operators as continuously indexed matrices by using the distribution
$$\delta'(x) = \text{``}\frac{d}{dx}\delta(x)\text{''}. \tag{2.66}$$

Figure 2.3: Smooth approximations to $\delta(x - a)$ and $\delta'(x - a)$.

The quotes are to warn us that we are not really taking the derivative of the highly singular delta function. The symbol $\delta'(x)$ is properly defined by its behaviour in an integral
$$\begin{aligned}
\int_a^b \delta'(x - y) f(y)\,dy &= \int_a^b \frac{d}{dx} \delta(x - y) f(y)\,dy\\
&= -\int_a^b f(y) \frac{d}{dy} \delta(x - y)\,dy\\
&= \int_a^b f'(y) \delta(x - y)\,dy \quad \text{(integration by parts)}\\
&= f'(x).
\end{aligned}$$
The manipulations here are purely formal, and serve only to motivate the defining property
$$\int_a^b \delta'(x - y) f(y)\,dy = f'(x). \tag{2.67}$$

It is, however, sometimes useful to think of a smooth approximation to $\delta'(x - a)$ being the genuine derivative of a smooth approximation to $\delta(x - a)$, as illustrated in figure 2.3.

We can now define higher "derivatives" of $\delta(x)$ by
$$\int_a^b \delta^{(n)}(x) f(x)\,dx = (-1)^n f^{(n)}(0), \tag{2.68}$$
and use them to represent any linear differential operator as a formal integral kernel.
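
The defining property (2.67) can be illustrated by replacing $\delta$ with a narrow Gaussian and taking its genuine derivative. This is a numerical sketch under our own assumptions (Gaussian width `eps`, a midpoint rule over ten widths either side of $x$); it is not a definition of the distribution.

```python
import math

def delta_eps(x, eps):
    """Narrow normalized Gaussian standing in for delta(x)."""
    return math.exp(-x * x / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def ddelta_eps(x, eps):
    """Genuine derivative of the smooth spike delta_eps."""
    return -x / (eps * eps) * delta_eps(x, eps)

def act_with_delta_prime(f, x, eps=0.01, steps=4000):
    """Midpoint-rule value of int delta_eps'(x - y) f(y) dy over |x - y| < 10 eps."""
    lo, h = x - 10.0 * eps, 20.0 * eps / steps
    return sum(ddelta_eps(x - (lo + (k + 0.5) * h), eps) * f(lo + (k + 0.5) * h)
               for k in range(steps)) * h

approx = act_with_delta_prime(math.sin, 0.5)   # close to cos(0.5)
```

The residual discrepancy is of order `eps**2` and shrinks as the spike is narrowed.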

Example: In chapter one we formally evaluated a functional second derivative and ended up with the distributional kernel (1.186), which we here write as
$$\begin{aligned}
k(x,y) &= -\frac{d}{dy}\left( p(y) \frac{d}{dy} \delta(y - x) \right) + q(y)\delta(y - x)\\
&= -p(y)\delta''(y - x) - p'(y)\delta'(y - x) + q(y)\delta(y - x).
\end{aligned} \tag{2.69}$$
When $k$ acts on a function $u$, it gives
$$\begin{aligned}
\int k(x,y) u(y)\,dy &= \int \left\{ -p(y)\delta''(y - x) - p'(y)\delta'(y - x) + q(y)\delta(y - x) \right\} u(y)\,dy\\
&= \int \delta(y - x) \left\{ -[p(y)u(y)]'' + [p'(y)u(y)]' + q(y)u(y) \right\} dy\\
&= \int \delta(y - x) \left\{ -p(y)u''(y) - p'(y)u'(y) + q(y)u(y) \right\} dy\\
&= -\frac{d}{dx}\left( p(x) \frac{du}{dx} \right) + q(x)u(x).
\end{aligned} \tag{2.70}$$
The continuous matrix (1.186) therefore does, as indicated in chapter one, represent the Sturm-Liouville operator $L$ defined in (1.182).

Exercise 2.6: Consider the distributional kernel
$$k(x,y) = a_2(y)\delta''(x - y) + a_1(y)\delta'(x - y) + a_0(y)\delta(x - y).$$
Show that
$$\int k(x,y) u(y)\,dy = (a_2(x)u(x))'' + (a_1(x)u(x))' + a_0(x)u(x).$$
Similarly show that
$$k(x,y) = a_2(x)\delta''(x - y) + a_1(x)\delta'(x - y) + a_0(x)\delta(x - y)$$
leads to
$$\int k(x,y) u(y)\,dy = a_2(x)u''(x) + a_1(x)u'(x) + a_0(x)u(x).$$

Exercise 2.7: The distributional kernel (2.69) was originally obtained as a functional second derivative
$$k(x_1, x_2) = \frac{\delta}{\delta y(x_1)} \left( \frac{\delta J[y]}{\delta y(x_2)} \right) = -\frac{d}{dx_2}\left( p(x_2) \frac{d}{dx_2} \delta(x_2 - x_1) \right) + q(x_2)\delta(x_2 - x_1).$$
By analogy with conventional partial derivatives, we would expect that
$$\frac{\delta}{\delta y(x_1)} \left( \frac{\delta J[y]}{\delta y(x_2)} \right) = \frac{\delta}{\delta y(x_2)} \left( \frac{\delta J[y]}{\delta y(x_1)} \right),$$
but $x_1$ and $x_2$ appear asymmetrically in $k(x_1, x_2)$. Define
$$k^T(x_1, x_2) = k(x_2, x_1),$$
and show that
$$\int k^T(x_1, x_2) u(x_2)\,dx_2 = \int k(x_1, x_2) u(x_2)\,dx_2.$$
Conclude that, superficial appearance notwithstanding, we do have $k(x_1, x_2) = k(x_2, x_1)$.

The example and exercises show that linear differential operators correspond to continuously-infinite matrices having entries only infinitesimally close to their main diagonal.
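
A finite-dimensional caricature makes the last remark concrete. In the sketch below we sample a function on a grid of spacing `h` and represent $d/dx$ by a central-difference matrix whose only non-zero entries sit immediately beside the diagonal. The discretization, the zeroed boundary rows, and the names are our own illustrative choices.

```python
def derivative_matrix(n, h):
    """n-by-n central-difference matrix; first and last rows are left as zero."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        D[i][i + 1] = 1.0 / (2.0 * h)    # entry just above the diagonal
        D[i][i - 1] = -1.0 / (2.0 * h)   # entry just below the diagonal
    return D

def apply(A, v):
    """Matrix-vector product, the discrete analogue of (2.64)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

n, h = 11, 0.1
xs = [i * h for i in range(n)]
f = [x * x for x in xs]
df = apply(derivative_matrix(n, h), f)   # interior entries equal 2x
```

As `h` shrinks the non-zero entries crowd ever closer to the diagonal while growing like `1/h`, which is the discrete shadow of $\delta'(x - y)$.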

2.3.2 Distributions and test-functions 

It is possible to work most of the problems in this book with no deeper understanding of what a delta-function is than that presented in section 2.3.1. At some point however, the more careful reader will wonder about the logical structure of what we are doing, and will soon discover that too free a use of $\delta(x)$ and its derivatives can lead to paradoxes. How do such creatures fit into the function-space picture, and what sort of manipulations with them are valid?

We often think of $\delta(x)$ as being a "limit" of a sequence of functions whose graphs are getting narrower and narrower while their height grows to keep the area under the curve fixed. An example would be the spike function $\delta_\epsilon(x - a)$ appearing in figure 2.4.

Figure 2.4: Approximation $\delta_\epsilon(x - a)$ to $\delta(x - a)$.

The $L^2$ norm of $\delta_\epsilon$,
$$\|\delta_\epsilon\|^2 = \int |\delta_\epsilon(x)|^2\,dx = \frac{1}{\epsilon}, \tag{2.71}$$
tends to infinity as $\epsilon \to 0$, so $\delta_\epsilon$ cannot be tending to any function in $L^2$. This delta function has infinite "length," and so is not an element of our Hilbert space.

The simple spike is not the only way to construct a delta function. In Fourier theory we meet
$$\delta_\lambda(x) = \int_{-\lambda}^{\lambda} e^{ikx} \frac{dk}{2\pi} = \frac{1}{\pi} \frac{\sin \lambda x}{x}, \tag{2.72}$$
which becomes a delta-function when $\lambda$ becomes large. In this case
$$\|\delta_\lambda\|^2 = \int_{-\infty}^{\infty} \frac{\sin^2 \lambda x}{\pi^2 x^2}\,dx = \lambda/\pi. \tag{2.73}$$
Again the "limit" has infinite length and cannot be accommodated in Hilbert space. This $\delta_\lambda(x)$ is even more pathological than $\delta_\epsilon$. It provides a salutary counter-example to the often asserted "fact" that $\delta(x) = 0$ for $x \ne 0$. As $\lambda$ becomes large $\delta_\lambda(0)$ diverges to infinity. At any fixed non-zero $x$, however, $\delta_\lambda(x)$ oscillates between $\pm 1/(\pi x)$ as $\lambda$ grows. Consequently the limit $\lim_{\lambda\to\infty} \delta_\lambda(x)$ exists nowhere. It therefore makes no sense to assign a numerical value to $\delta(x)$ at any $x$.

Given its wild behaviour, it is not surprising that mathematicians looked askance at Dirac's $\delta(x)$. It was only in 1944, long after its effectiveness in solving physics and engineering problems had become an embarrassment, that Laurent Schwartz was able to tame $\delta(x)$ by creating his theory of distributions. Using the language of distributions we can state precisely the conditions under which a manoeuvre involving singular objects such as $\delta'(x)$ is legitimate.
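
The contrast between the pointwise misbehaviour of $\delta_\lambda(x) = \sin(\lambda x)/(\pi x)$ and its good behaviour under an integral can be seen numerically. In this sketch the Gaussian $e^{-x^2/2}$ plays the role of a test function, and the integration range and step count are assumptions of ours.

```python
import math

def delta_lam(x, lam):
    """delta_lambda(x) = sin(lambda x) / (pi x), for x != 0."""
    return math.sin(lam * x) / (math.pi * x)

def pair_with_gaussian(lam, half_width=10.0, steps=20000):
    """Midpoint-rule value of int delta_lam(x) exp(-x^2/2) dx (no sample at x = 0)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += delta_lam(x, lam) * math.exp(-0.5 * x * x)
    return total * h

paired = pair_with_gaussian(10.0)                                  # close to phi(0) = 1
samples = [delta_lam(0.5, lam) for lam in (10.0, 13.0, 16.0, 19.0)]  # keeps changing sign
```

The pairing settles down to the value of the test function at the origin, even though the sampled values at the fixed point $x = 0.5$ never settle at all.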

Schwartz' theory is built on a concept from linear algebra. Recall that the dual space $V^*$ of a vector space $V$ is the vector space of linear functions from the original vector space $V$ to the field over which it is defined. We consider $\delta(x)$ to be an element of the dual space of a vector space $\mathcal{T}$ of test functions. When a test function $\varphi(x)$ is plugged in, the $\delta$-machine returns the number $\varphi(0)$. This operation is a linear map because the action of $\delta$ on $\lambda\varphi(x) + \mu\chi(x)$ is to return $\lambda\varphi(0) + \mu\chi(0)$. Test functions are smooth (infinitely differentiable) functions that tend rapidly to zero at infinity. Exactly what class of function we choose for $\mathcal{T}$ depends on the problem at hand. If we are going to make extensive use of Fourier transforms, for example, we might select the Schwartz space, $\mathcal{S}(\mathbb{R})$. This is the space of infinitely differentiable functions $\varphi(x)$ such that the seminorms [3]
$$|\varphi|_{m,n} = \sup_{x \in \mathbb{R}} \left\{ |x|^n \left| \frac{d^m \varphi}{dx^m} \right| \right\} \tag{2.74}$$
are finite for all positive integers $m$ and $n$. The Schwartz space has the advantage that if $\varphi$ is in $\mathcal{S}(\mathbb{R})$, then so is its Fourier transform. Another popular space of test functions is $\mathcal{D}$, consisting of $C^\infty$ functions of compact support, meaning that each function is identically zero outside some finite interval. Only if we want to prove theorems is a precise specification of $\mathcal{T}$ essential. For most physics calculations infinite differentiability and a rapid enough decrease at infinity for us to be able to ignore boundary terms is all that we need.

[3] A seminorm $|\cdots|$ has all the properties of a norm except that $|\varphi| = 0$ does not imply that $\varphi = 0$.

The "nice" behaviour of the test functions compensates for the "nasty" behaviour of $\delta(x)$ and its relatives. The objects, such as $\delta(x)$, composing the dual space of $\mathcal{T}$ are called generalized functions, or distributions. Actually, not every linear map $\mathcal{T} \to \mathbb{R}$ is to be included in the dual space because, for technical reasons, we must require the maps to be continuous. In other words, if $\varphi_n \to \varphi$, we want our distributions $u$ to obey $u(\varphi_n) \to u(\varphi)$. Making precise what we mean by $\varphi_n \to \varphi$ is part of the task of specifying $\mathcal{T}$. In the Schwartz space, for example, we declare that $\varphi_n \to \varphi$ if $|\varphi_n - \varphi|_{m,n} \to 0$, for all positive $m$, $n$. When we restrict a dual space to continuous functionals, we usually denote it by $V'$ rather than $V^*$. The space of distributions is therefore $\mathcal{T}'$.

When they wish to stress the dual-space aspect of distribution theory, mathematically-minded authors use the notation
$$\delta(\varphi) = \varphi(0), \tag{2.75}$$
or
$$(\delta, \varphi) = \varphi(0), \tag{2.76}$$
in place of the common, but purely formal,
$$\int \delta(x)\varphi(x)\,dx = \varphi(0). \tag{2.77}$$
The expression $(\delta, \varphi)$ here represents the pairing of the element $\varphi$ of the vector space $\mathcal{T}$ with the element $\delta$ of its dual space $\mathcal{T}'$. It should not be thought of as an inner product as the distribution and the test function lie in different spaces. The "integral" in the common notation is purely symbolic, of course, but the common notation should not be despised even by those in quest of rigour. It suggests correct results, such as
$$\int \delta(ax - b)\varphi(x)\,dx = \frac{1}{|a|}\varphi(b/a), \tag{2.78}$$
which would look quite unmotivated in the dual-space notation.
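
The scaling rule (2.78) can be checked by smearing the delta function into a narrow Gaussian. What follows is a sketch under our own assumptions: the width `eps`, the integration window of ten effective widths either side of $x = b/a$, and the helper names are illustrative only.

```python
import math

def delta_eps(x, eps):
    """Narrow normalized Gaussian standing in for delta(x)."""
    return math.exp(-x * x / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def smeared(a, b, phi, eps=1e-3, half_width=10.0, steps=4000):
    """Midpoint-rule value of int delta_eps(a x - b) phi(x) dx near x = b/a."""
    centre = b / a
    span = half_width * eps / abs(a)
    h = 2.0 * span / steps
    points = (centre - span + (k + 0.5) * h for k in range(steps))
    return sum(delta_eps(a * x - b, eps) * phi(x) for x in points) * h

value = smeared(-2.0, 1.0, math.cos)   # close to cos(-0.5) / |-2|
```

Choosing a negative `a` shows why the absolute value is needed in the prefactor.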
The distribution $\delta'(x)$ is now defined by the pairing
$$(\delta', \varphi) = -\varphi'(0), \tag{2.79}$$
where the minus sign comes from imagining an integration by parts that takes the "derivative" off $\delta(x)$ and puts it on to the smooth function $\varphi(x)$:
$$\text{``}\int \delta'(x)\varphi(x)\,dx\text{''} = -\int \delta(x)\varphi'(x)\,dx. \tag{2.80}$$
Similarly $\delta^{(n)}(x)$ is now defined by the pairing
$$(\delta^{(n)}, \varphi) = (-1)^n \varphi^{(n)}(0). \tag{2.81}$$

The "nicer" the class of test function we take, the "nastier" the class of distributions we can handle. For example, the Hilbert space $L^2$ is its own dual: the Riesz-Fréchet theorem (see exercise 2.10) asserts that any continuous linear map $F : L^2 \to \mathbb{R}$ can be written as $F[f] = \langle l, f\rangle$ for some $l \in L^2$. The delta-function map is not continuous when considered as a map from $L^2 \to \mathbb{R}$ however. An arbitrarily small change, $f \to f + \delta f$, in a function (small in the $L^2$ sense of $\|\delta f\|$ being small) can produce an arbitrarily large change in $f(0)$. Thus $L^2$ functions are not "nice" enough for their dual space to be able to accommodate the delta function. Another way of understanding this is to remember that we regard two $L^2$ functions as being the same whenever $\|f_1 - f_2\| = 0$. This distance will be zero even if $f_1$ and $f_2$ differ from one another on a countable set of points. As we have remarked earlier, this means that elements of $L^2$ are not really functions at all: they do not have an assigned value at each point. They are, instead, only equivalence classes of functions. Since $f(0)$ is undefined, any attempt to interpret the statement $\int \delta(x) f(x)\,dx = f(0)$ for $f$ an arbitrary element of $L^2$ is necessarily doomed to failure. Continuous functions, however, do have well-defined values at every point. If we take the space of test functions $\mathcal{T}$ to consist of all continuous functions, but not demand that they be differentiable, then $\mathcal{T}'$ will include the delta function, but not its "derivative" $\delta'(x)$, as this requires us to evaluate $f'(0)$. If we require the test functions to be once-differentiable, then $\mathcal{T}'$ will include $\delta'(x)$ but not $\delta''(x)$, and so on.

When we add suitable spaces $\mathcal{T}$ and $\mathcal{T}'$ to our toolkit, we are constructing what is called a rigged [4] Hilbert space. In such a rigged space we have the inclusion
$$\mathcal{T} \subset L^2 \equiv [L^2]' \subset \mathcal{T}'. \tag{2.82}$$
The idea is to take the space $\mathcal{T}'$ big enough to contain objects such as the limit of our sequence of "approximate" delta functions $\delta_\epsilon$, which does not converge to anything in $L^2$.

[4] "Rigged" as in a sailing ship ready for sea, not "rigged" as in a corrupt election.

Ordinary functions can also be regarded as distributions, and this helps illuminate the different senses in which a sequence $u_n$ can converge. For example, we can consider the functions
$$u_n = \sin n\pi x, \quad 0 < x < 1, \tag{2.83}$$
as being either elements of $L^2[0,1]$ or as distributions. As distributions we evaluate them on a smooth function $\varphi$ as
$$(u_n, \varphi) = \int_0^1 \varphi(x) u_n(x)\,dx. \tag{2.84}$$
Now
$$\lim_{n\to\infty} (u_n, \varphi) = 0, \tag{2.85}$$
since the high-frequency Fourier coefficients of any smooth function tend to zero. We deduce that as a distribution we have $\lim_{n\to\infty} u_n = 0$, the convergence being pointwise on the space of test functions. Considered as elements of $L^2[0,1]$, however, the $u_n$ do not tend to zero. Their norm is $\|u_n\| = 1/\sqrt{2}$ and so all the $u_n$ remain at the same fixed distance from 0.
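
The two senses of convergence can be compared numerically. In this sketch the polynomial $x(1-x)$ is only a convenient smooth stand-in for a test function (it is our assumption, not a member of any particular $\mathcal{T}$), and the integrals are midpoint-rule approximations.

```python
import math

def midpoint(g, steps=20000):
    """Midpoint-rule integral of g over [0, 1]."""
    h = 1.0 / steps
    return sum(g((k + 0.5) * h) for k in range(steps)) * h

def phi(x):
    return x * (1.0 - x)          # smooth stand-in for a test function (assumption)

def pairing(n):
    """(u_n, phi) with u_n = sin(n pi x), as in (2.84)."""
    return midpoint(lambda x: phi(x) * math.sin(n * math.pi * x))

def norm_squared(n):
    """||u_n||^2 in L2[0, 1]."""
    return midpoint(lambda x: math.sin(n * math.pi * x) ** 2)

low, high = pairing(1), pairing(51)
size = norm_squared(51)
```

The pairing `high` is tiny compared with `low`, while `size` stays at one half however large the index becomes.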

Exercise 2.8: Here we show that the elements of $L^2[a,b]$, which we defined in exercise 2.2 to be the formal limits of Cauchy sequences of continuous functions, may be thought of as distributions.

i) Let $\varphi(x)$ be a test function and $f_n(x)$ a Cauchy sequence of continuous functions defining $f \in L^2$. Use the Cauchy-Schwarz-Bunyakovsky inequality to show that the sequence of numbers $\langle \varphi, f_n\rangle$ is Cauchy and so deduce that $\lim_{n\to\infty} \langle \varphi, f_n\rangle$ exists.

ii) Let $\varphi(x)$ be a test function and $f_n^{(1)}(x)$ and $f_n^{(2)}(x)$ be a pair of equivalent sequences defining the same element $f \in L^2$. Use Cauchy-Schwarz-Bunyakovsky to show that
$$\lim_{n\to\infty} \langle \varphi, f_n^{(1)} - f_n^{(2)}\rangle = 0.$$
Combine this result with that of the preceding exercise to deduce that we can set
$$(\varphi, f) \equiv \lim_{n\to\infty} \langle \varphi^*, f_n\rangle,$$
and so define $f \equiv \lim_{n\to\infty} f_n$ as a distribution.

The interpretation of elements of $L^2$ as distributions is simultaneously simpler and more physical than the classical interpretation via the Lebesgue integral.



Weak derivatives 



By exploiting the infinite differentiability of our test functions, we were able to make mathematical sense of the "derivative" of the highly singular delta function. The same idea of a formal integration by parts can be used to define the "derivative" for any distribution, and also for ordinary functions that would not usually be regarded as being differentiable.

We therefore define the weak or distributional derivative $v(x)$ of a distribution $u(x)$ by requiring its evaluation on a test function $\varphi \in \mathcal{T}$ to be
$$\int v(x)\varphi(x)\,dx \stackrel{\rm def}{=} -\int u(x)\varphi'(x)\,dx. \tag{2.86}$$
In the more formal pairing notation we write
$$(v, \varphi) \stackrel{\rm def}{=} -(u, \varphi'). \tag{2.87}$$
The right hand side of (2.87) is a continuous linear function of $\varphi$, and so, therefore, is the left hand side. Thus the weak derivative $u' \equiv v$ is a well-defined distribution for any $u$.

When $u(x)$ is an ordinary function that is differentiable in the conventional sense, its weak derivative coincides with the usual derivative. When the function is not conventionally differentiable the weak derivative still exists, but does not assign a numerical value to the derivative at each point. It is therefore a distribution and not a function.

The elements of $L^2$ are not quite functions, having no well-defined value at a point, but are particularly mild-mannered distributions, and their weak derivatives may themselves be elements of $L^2$. It is in this weak sense that we will, in later chapters, allow differential operators to act on $L^2$ "functions."

Example: In the weak sense
$$\frac{d}{dx}|x| = \operatorname{sgn}(x), \tag{2.88}$$
$$\frac{d}{dx}\operatorname{sgn}(x) = 2\delta(x). \tag{2.89}$$
The object $|x|$ is an ordinary function, but $\operatorname{sgn}(x)$ has no definite value at $x = 0$, whilst $\delta(x)$ has no definite value at any $x$.
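
Both statements in this example can be tested directly from the definition (2.87). The sketch below uses a shifted Gaussian as a stand-in test function and a midpoint rule whose cell boundaries include $x = 0$; these choices and the helper names are ours.

```python
import math

def phi(x):
    return math.exp(-(x - 0.3) ** 2)      # stand-in test function (assumption)

def dphi(x):
    return -2.0 * (x - 0.3) * phi(x)

def sgn(x):
    return (x > 0) - (x < 0)

def integrate(g, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint rule; with these defaults x = 0 is a cell boundary, never a sample."""
    h = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * h) for k in range(steps)) * h

lhs_abs = -integrate(lambda x: abs(x) * dphi(x))   # -(|x|, phi')
rhs_abs = integrate(lambda x: sgn(x) * phi(x))     # (sgn, phi)
lhs_sgn = -integrate(lambda x: sgn(x) * dphi(x))   # -(sgn, phi')
rhs_sgn = 2.0 * phi(0.0)                           # (2 delta, phi)
```
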

Example: As a more subtle illustration, consider the weak derivative of the function $\ln|x|$. With $\varphi(x)$ a test function, the improper integral
$$-\int_{-\infty}^{\infty} \varphi'(x) \ln|x|\,dx \equiv -\lim_{\epsilon,\epsilon'\to 0} \left( \int_{-\infty}^{-\epsilon} + \int_{\epsilon'}^{\infty} \right) \varphi'(x) \ln|x|\,dx \tag{2.90}$$
is convergent and defines the pairing $(-\ln|x|, \varphi')$. We wish to integrate by parts and interpret the result as $([\ln|x|]', \varphi)$. The logarithm is differentiable in the conventional sense away from $x = 0$, and
$$[\ln|x|\,\varphi(x)]' = \frac{1}{x}\varphi(x) + \ln|x|\,\varphi'(x), \quad x \ne 0. \tag{2.91}$$
From this we find that
$$-(\ln|x|, \varphi') = \lim_{\epsilon,\epsilon'\to 0} \left\{ \left( \int_{-\infty}^{-\epsilon} + \int_{\epsilon'}^{\infty} \right) \frac{1}{x}\varphi(x)\,dx + \Big( \varphi(\epsilon') \ln|\epsilon'| - \varphi(-\epsilon) \ln|\epsilon| \Big) \right\}. \tag{2.92}$$
So far $\epsilon$ and $\epsilon'$ are unrelated except in that they are both being sent to zero. If, however, we choose to make them equal, $\epsilon = \epsilon'$, then the integrated-out part becomes
$$\big( \varphi(\epsilon) - \varphi(-\epsilon) \big) \ln|\epsilon| \sim 2\varphi'(0)\,\epsilon \ln|\epsilon|, \tag{2.93}$$
and this tends to zero as $\epsilon$ becomes small. In this case
$$-([\ln|x|], \varphi') = \lim_{\epsilon\to 0} \left\{ \left( \int_{-\infty}^{-\epsilon} + \int_{\epsilon}^{\infty} \right) \frac{1}{x}\varphi(x)\,dx \right\}. \tag{2.94}$$
By the definition of the weak derivative, the left hand side of (2.94) is the pairing $([\ln|x|]', \varphi)$. We conclude that
$$\frac{d}{dx} \ln|x| = P\left( \frac{1}{x} \right), \tag{2.95}$$
where $P(1/x)$, the principal-part distribution, is defined by the right-hand-side of (2.94). It is evaluated on the test function $\varphi(x)$ by forming $\int \varphi(x)/x\,dx$, but with an infinitesimal interval from $-\epsilon$ to $+\epsilon$ omitted from the range of integration. It is essential that this omitted interval lie symmetrically about the dangerous point $x = 0$. Otherwise the integrated-out part will not vanish in the $\epsilon \to 0$ limit. The resulting principal-part integral, written $P\!\int \varphi(x)/x\,dx$, is then convergent and $P(1/x)$ is a well-defined distribution despite the singularity in the integrand. Principal-part integrals are common in physics. We will next meet them when we study Green functions.
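
The equality (2.94) lends itself to a numerical check. In this sketch the principal-part integral is computed by pairing $x$ with $-x$, which removes the symmetric excluded interval automatically, and the stand-in test function is a Gaussian centred at $x = 1$. These are our own choices, and the midpoint rule handles the integrable logarithmic singularity only to modest accuracy.

```python
import math

def phi(x):
    return math.exp(-(x - 1.0) ** 2)      # stand-in test function (assumption)

def dphi(x):
    return -2.0 * (x - 1.0) * phi(x)

def midpoint(g, lo, hi, steps):
    h = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * h) for k in range(steps)) * h

# P int phi(x)/x dx, folded onto x > 0 so the integrand is regular at the origin.
principal = midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 12.0, 24000)

# -(ln|x|, phi'): the left-hand side of (2.94).
weak = -midpoint(lambda x: math.log(abs(x)) * dphi(x), -12.0, 12.0, 48000)
```
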

For further reading on distributions and their applications we recommend M. J. Lighthill, Fourier Analysis and Generalised Functions, or F. G. Friedlander, Introduction to the Theory of Distributions. Both books are published by Cambridge University Press.

2.4 Further Exercises and Problems

The first two exercises lead the reader through a proof of the Riesz-Fréchet theorem. Although not an essential part of our story, they demonstrate how "completeness" is used in Hilbert space theory, and provide some practice with "$\epsilon$, $\delta$" arguments for those who desire it.

Exercise 2.9: Show that if a norm $\|\ \|$ is derived from an inner product, then it obeys the parallelogram law
$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2).$$
Let $N$ be a complete linear subspace of a Hilbert space $H$. Let $g \notin N$, and let
$$\inf_{f \in N} \|g - f\| = d.$$
Show that there exists a sequence $f_n \in N$ such that $\lim_{n\to\infty} \|f_n - g\| = d$. Use the parallelogram law to show that the sequence $f_n$ is Cauchy, and hence deduce that there is a unique $f \in N$ such that $\|g - f\| = d$. From this, conclude that $d > 0$. Now show that $\langle (g - f), h\rangle = 0$ for all $h \in N$.

Exercise 2.10: Riesz-Fréchet theorem. Let $L[h]$ be a continuous linear functional on a Hilbert space $H$. Here continuous means that
$$\|h_n - h\| \to 0 \ \Rightarrow\ L[h_n] \to L[h].$$
Show that the set $N = \{f \in H : L[f] = 0\}$ is a complete linear subspace of $H$.

Suppose now that there is a $g \in H$ such that $L[g] \ne 0$, and let $l \in H$ be the vector "$g - f$" from the previous problem. Show that
$$L[h] = \langle \alpha l, h\rangle, \quad \text{where } \alpha = L[g]/\langle l, g\rangle = L[g]/\|l\|^2.$$
A continuous linear functional can therefore be expressed as an inner product.

Next we have some problems on orthogonal polynomials and three-term recurrence relations. They provide an excuse for reviewing linear algebra, and also serve to introduce the theory behind some practical numerical methods.

Exercise 2.11: Let $\{P_n(x)\}$ be a family of polynomials orthonormal on $[a,b]$ with respect to a positive weight function $w(x)$, and with $\deg[P_n(x)] = n$. Let us also scale $w(x)$ so that $\int_a^b w(x)\,dx = 1$, and $P_0(x) = 1$.

a) Suppose that the $P_n(x)$ obey the three-term recurrence relation
$$xP_n(x) = b_n P_{n+1}(x) + a_n P_n(x) + b_{n-1} P_{n-1}(x); \quad P_{-1}(x) = 0, \ P_0(x) = 1.$$
Define
$$p_n(x) = P_n(x)\,(b_{n-1} b_{n-2} \cdots b_0),$$
and show that
$$x p_n(x) = p_{n+1}(x) + a_n p_n(x) + b_{n-1}^2 p_{n-1}(x); \quad p_{-1}(x) = 0, \ p_0(x) = 1.$$
Conclude that the $p_n(x)$ are monic, i.e. the coefficient of their leading power of $x$ is unity.

b) Show also that the functions
$$q_n(x) = \int_a^b \frac{p_n(x) - p_n(\xi)}{x - \xi}\, w(\xi)\,d\xi$$
are degree $n-1$ monic polynomials that obey the same recurrence relation as the $p_n(x)$, but with initial conditions $q_0(x) = 0$, $q_1(x) \equiv \int_a^b w\,dx = 1$.

Warning: while the $q_n(x)$ polynomials defined in part b) turn out to be very useful, they are not mutually orthogonal with respect to $\langle\ ,\ \rangle_w$.

Exercise 2.12: Gaussian quadrature. Orthogonal polynomials have application to numerical integration. Let the polynomials $\{P_n(x)\}$ be orthonormal on $[a,b]$ with respect to the positive weight function $w(x)$, and let $x_\nu$, $\nu = 1, \ldots, N$ be the zeros of $P_N(x)$. You will show that if we define the weights
$$w_\nu = \int_a^b \frac{P_N(x)}{P_N'(x_\nu)(x - x_\nu)}\, w(x)\,dx$$
then the approximate integration scheme
$$\int_a^b f(x) w(x)\,dx \approx w_1 f(x_1) + w_2 f(x_2) + \cdots + w_N f(x_N),$$
known as Gauss' quadrature rule, is exact for $f(x)$ any polynomial of degree less than or equal to $2N - 1$.

a) Let $\pi(x) = (x - \xi_1)(x - \xi_2) \cdots (x - \xi_N)$ be a polynomial of degree $N$. Given a function $F(x)$, show that
$$F_L(x) \stackrel{\rm def}{=} \sum_{\nu=1}^{N} F(\xi_\nu) \frac{\pi(x)}{\pi'(\xi_\nu)(x - \xi_\nu)}$$
is a polynomial of degree $N - 1$ that coincides with $F(x)$ at $x = \xi_\nu$, $\nu = 1, \ldots, N$. (This is Lagrange's interpolation formula.)

b) Show that if $F(x)$ is a polynomial of degree $N - 1$ or less then $F_L(x) = F(x)$.

c) Let $f(x)$ be a polynomial of degree $2N - 1$ or less. Cite the polynomial division algorithm to show that there exist polynomials $Q(x)$ and $R(x)$, each of degree $N - 1$ or less, such that
$$f(x) = P_N(x) Q(x) + R(x).$$

d) Show that $f(x_\nu) = R(x_\nu)$, and that
$$\int_a^b f(x) w(x)\,dx = \int_a^b R(x) w(x)\,dx.$$

e) Combine parts a), b) and d) to establish Gauss' result.

f) Show that if we normalize $w(x)$ so that $\int w\,dx = 1$ then the weights $w_\nu$ can be expressed as $w_\nu = q_N(x_\nu)/p_N'(x_\nu)$, where $p_n(x)$, $q_n(x)$ are the monic polynomials defined in the preceding problem.

The ultimate large-$N$ exactness of Gaussian quadrature can be expressed as
$$w(x) = \lim_{N\to\infty} \left\{ \sum_{\nu} \delta(x - x_\nu)\, w_\nu \right\}.$$
Of course, a sum of Dirac delta-functions can never become a continuous function in any ordinary sense. The equality holds only after both sides are integrated against a smooth test function, i.e., when it is considered as a statement about distributions.
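
To see the quadrature rule of Exercise 2.12 in action, here is a small sketch for the assumed special case $w = 1$ on $[-1,1]$ and $N = 3$, where the nodes are the zeros $0, \pm\sqrt{3/5}$ of the Legendre polynomial $P_3$. The weights are obtained by integrating the Lagrange basis polynomials, which is equivalent to the weight formula in the exercise; since these are quadratics, a single Simpson panel integrates them exactly. All function names are ours.

```python
import math

nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]   # zeros of P_3(x) = (5x^3 - 3x)/2

def lagrange_basis(k, x):
    """pi(x) / (pi'(x_k) (x - x_k)) written as a product over the other nodes."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            value *= (x - xj) / (nodes[k] - xj)
    return value

def simpson(g):
    """One Simpson panel on [-1, 1]; exact for polynomials up to cubic."""
    return (g(-1.0) + 4.0 * g(0.0) + g(1.0)) / 3.0

weights = [simpson(lambda x, k=k: lagrange_basis(k, x)) for k in range(3)]

def gauss(f):
    """Three-point Gauss rule: sum of w_nu f(x_nu)."""
    return sum(w * f(x) for w, x in zip(weights, nodes))
```

With three nodes the rule integrates polynomials up to degree $2N - 1 = 5$ exactly, and first fails at degree six.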

Exercise 2.13: The completeness of a set of polynomials $\{P_n(x)\}$, orthonormal
with respect to the positive weight function $w(x)$, is equivalent to the
statement that

$$ \sum_{n=0}^{\infty} P_n(x) P_n(y) = \frac{1}{w(x)}\,\delta(x-y). $$

It is useful to have a formula for the partial sums of this infinite series.
Suppose that the polynomials $P_n(x)$ obey the three-term recurrence relation

$$ x P_n(x) = b_n P_{n+1}(x) + a_n P_n(x) + b_{n-1} P_{n-1}(x); \qquad P_{-1}(x) = 0, \quad P_0(x) = 1. $$

Use this recurrence relation, together with its initial conditions, to obtain the
Christoffel-Darboux formula

$$ \sum_{n=0}^{N-1} P_n(x) P_n(y) = \frac{b_{N-1}\,[P_N(x) P_{N-1}(y) - P_{N-1}(x) P_N(y)]}{x - y}. $$

Exercise 2.14: Again suppose that the polynomials $P_n(x)$ obey the three-term
recurrence relation

$$ x P_n(x) = b_n P_{n+1}(x) + a_n P_n(x) + b_{n-1} P_{n-1}(x); \qquad P_{-1}(x) = 0, \quad P_0(x) = 1. $$

Consider the $N$-by-$N$ tridiagonal matrix eigenvalue problem

$$
\begin{pmatrix}
a_{N-1} & b_{N-2} & 0 & \cdots & 0 \\
b_{N-2} & a_{N-2} & b_{N-3} & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & b_1 & a_1 & b_0 \\
0 & \cdots & 0 & b_0 & a_0
\end{pmatrix}
\begin{pmatrix} u_{N-1} \\ u_{N-2} \\ \vdots \\ u_1 \\ u_0 \end{pmatrix}
= x
\begin{pmatrix} u_{N-1} \\ u_{N-2} \\ \vdots \\ u_1 \\ u_0 \end{pmatrix}.
$$

a) Show that the eigenvalues $x$ are given by the zeros $x_\nu$, $\nu = 1, \ldots, N$,
of $P_N(x)$, and that the corresponding eigenvectors have components

$$ u_n = P_n(x_\nu), \qquad n = 0, \ldots, N-1. $$

b) Take the $x \to y$ limit of the Christoffel-Darboux formula from the preceding
problem, and use it to show that the orthogonality and completeness
relations for the eigenvectors can be written as

$$ \sum_{n=0}^{N-1} P_n(x_\nu) P_n(x_\mu) = w_\nu^{-1}\,\delta_{\nu\mu}, $$

$$ \sum_{\nu=1}^{N} w_\nu P_n(x_\nu) P_m(x_\nu) = \delta_{nm}, \qquad n, m \le N-1, $$

where $w_\nu^{-1} = b_{N-1} P'_N(x_\nu) P_{N-1}(x_\nu)$.

c) Use the original Christoffel-Darboux formula to show that, when the
$P_n(x)$ are orthonormal with respect to the positive weight function $w(x)$,
the normalization constants $w_\nu$ of this present problem coincide with the
weights $w_\nu$ occurring in the Gauss quadrature rule. Conclude from this
equality that the Gauss-quadrature weights are positive.
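
The content of this exercise can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper name `gauss_nodes_weights` is ours, the matrix is stored with index 0 first (the reverse of the ordering in the text, which leaves the eigenvalues unchanged), and the Legendre polynomials with the normalized weight $w(x) = 1/2$ on $[-1, 1]$ are chosen as the test case, for which $a_n = 0$ and $b_n = (n+1)/\sqrt{(2n+1)(2n+3)}$.

```python
import numpy as np

def gauss_nodes_weights(a, b):
    """Gauss nodes and weights from recurrence coefficients.

    a holds a_0..a_{N-1}, b holds b_0..b_{N-2}; the weight function is
    assumed normalized so that its integral is 1.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Symmetric tridiagonal matrix; index 0 is placed first here, the reverse
    # of the ordering used in the text, which does not change the spectrum.
    H = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
    nodes, vecs = np.linalg.eigh(H)
    # A normalized eigenvector has n = 0 component sqrt(w_nu) * P_0, with P_0 = 1.
    weights = vecs[0, :] ** 2
    return nodes, weights

# Illustrative choice: Legendre polynomials with w(x) = 1/2 on [-1, 1].
N = 5
n = np.arange(N - 1)
a = np.zeros(N)
b = (n + 1) / np.sqrt((2 * n + 1) * (2 * n + 3))
nodes, weights = gauss_nodes_weights(a, b)

# The rule should integrate x^4 * w(x) exactly, because 4 <= 2N - 1.
print(np.sum(weights * nodes**4))
```

The nodes are the zeros of $P_N$, and the squared $n = 0$ components of the normalized eigenvectors play the role of the $w_\nu$, which makes their positivity manifest.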

Exercise 2.15: Write the $N$-by-$N$ tridiagonal matrix eigenvalue problem from
the preceding exercise as $Hu = xu$, and set $d_N(x) = \det(xI - H)$. Similarly
define $d_n(x)$ to be the determinant of the $n$-by-$n$ tridiagonal submatrix with
$x - a_{n-1}, \ldots, x - a_0$ along its principal diagonal. Laplace-develop the determinant
$d_n(x)$ about its first row, and hence obtain the recurrence

$$ d_{n+1}(x) = (x - a_n)\, d_n(x) - b_{n-1}^2\, d_{n-1}(x). $$

Conclude that

$$ \det(xI - H) = p_N(x), $$

where $p_n(x)$ is the monic orthogonal polynomial obeying

$$ x p_n(x) = p_{n+1}(x) + a_n p_n(x) + b_{n-1}^2 p_{n-1}(x); \qquad p_{-1}(x) = 0, \quad p_0(x) = 1. $$

Exercise 2.16: Again write the $N$-by-$N$ tridiagonal matrix eigenvalue problem
from the preceding exercises as $Hu = xu$.

a) Show that the lowest and rightmost matrix element

$$ \langle 0|(xI - H)^{-1}|0\rangle \equiv (xI - H)^{-1}_{00} $$

of the resolvent matrix $(xI - H)^{-1}$ is given by a continued fraction
$G_{N-1,0}(x)$ where, for example,

$$ G_{3,z}(x) = \cfrac{1}{x - a_0 - \cfrac{b_0^2}{x - a_1 - \cfrac{b_1^2}{x - a_2 - \cfrac{b_2^2}{x - a_3 + z}}}}. $$

b) Use induction on $n$ to show that

$$ G_{n,z}(x) = \frac{q_n(x)\, z + q_{n+1}(x)}{p_n(x)\, z + p_{n+1}(x)}, $$

where $p_n(x)$, $q_n(x)$ are the monic polynomial functions of $x$ defined by
the recurrence relations

$$ x p_n(x) = p_{n+1}(x) + a_n p_n(x) + b_{n-1}^2 p_{n-1}(x), \qquad p_{-1}(x) = 0, \quad p_0(x) = 1, $$

$$ x q_n(x) = q_{n+1}(x) + a_n q_n(x) + b_{n-1}^2 q_{n-1}(x), \qquad q_0(x) = 0, \quad q_1(x) = 1. $$

c) Conclude that

$$ \langle 0|(xI - H)^{-1}|0\rangle = \frac{q_N(x)}{p_N(x)} $$

has a pole singularity when $x$ approaches an eigenvalue $x_\nu$. Show that
the residue of the pole (the coefficient of $1/(x - x_\nu)$) is equal to the
Gauss-quadrature weight $w_\nu$ for $w(x)$, the weight function (normalized
so that $\int w\,dx = 1$) from which the coefficients $a_n$, $b_n$ were derived.

Continued fractions were introduced by John Wallis in his Arithmetica
Infinitorum (1656), as was the recursion formula for their evaluation. Today,
when combined with the output of the next exercise, they provide the mathematical
underpinning of the Haydock recursion method in the band theory
of solids. Haydock's method computes $w(x) = \lim_{N\to\infty} \{\sum_\nu \delta(x - x_\nu) w_\nu\}$,
and interprets it as the local density of states that is measured in scanning
tunnelling microscopy.

Exercise 2.17: The Lanczos tridiagonalization algorithm. Let $V$ be an $N$-dimensional
complex vector space equipped with an inner product $(\ ,\ )$ and
let $H : V \to V$ be a hermitian linear operator. Starting from a unit vector $u_0$,
and taking $u_{-1} = 0$, recursively generate the unit vectors $u_n$ and the numbers
$a_n$, $b_n$ and $c_n$ by

$$ H u_n = b_n u_{n+1} + a_n u_n + c_{n-1} u_{n-1}, $$

where the coefficients

$$ a_n = (u_n, H u_n), \qquad c_{n-1} = (u_{n-1}, H u_n), $$

ensure that $u_{n+1}$ is perpendicular to both $u_n$ and $u_{n-1}$, and

$$ b_n = \| H u_n - a_n u_n - c_{n-1} u_{n-1} \|, $$

a positive real number, makes $\|u_{n+1}\| = 1$.

a) Use induction on $n$ to show that $u_{n+1}$, although only constructed to be
perpendicular to the previous two vectors, is in fact (and in the absence
of numerical rounding errors) perpendicular to all $u_m$ with $m \le n$.

b) Show that $a_n$, $c_n$ are real, and that $c_{n-1} = b_{n-1}$.

c) Conclude that $b_{N-1} = 0$, and (provided that no earlier $b_n$ happens to
vanish) that the $u_n$, $n = 0, \ldots, N-1$, constitute an orthonormal basis
for $V$, in terms of which $H$ is represented by the $N$-by-$N$ real-symmetric
tridiagonal matrix $H$ of the preceding exercises.
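
The recursion in this exercise can be sketched in a few lines. The code below is a minimal illustration, not a production implementation: the function name `lanczos`, the breakdown tolerance, and the random test matrix are our own assumptions, and the sketch uses the result of part b), $c_{n-1} = b_{n-1}$, rather than computing $c_{n-1}$ separately. It performs no re-orthogonalization, so it is only trustworthy for small matrices.

```python
import numpy as np

def lanczos(H, u0, tol=1e-12):
    """Tridiagonalize the hermitian matrix H starting from the vector u0.

    Returns (a, b, U): the diagonal entries a_n, the off-diagonal entries
    b_n, and the Lanczos vectors u_n as the columns of U.
    """
    N = H.shape[0]
    U = np.zeros((N, N), dtype=complex)
    a = np.zeros(N)
    b = np.zeros(N - 1)
    u_prev = np.zeros(N, dtype=complex)
    u = np.asarray(u0, dtype=complex) / np.linalg.norm(u0)
    c_prev = 0.0                      # c_{-1} multiplies u_{-1} = 0
    for n in range(N):
        U[:, n] = u
        Hu = H @ u
        a[n] = np.vdot(u, Hu).real    # a_n = (u_n, H u_n), real for hermitian H
        if n == N - 1:
            break                     # b_{N-1} vanishes in exact arithmetic
        r = Hu - a[n] * u - c_prev * u_prev
        b[n] = np.linalg.norm(r)
        if b[n] < tol:
            raise ValueError("an early b_n vanished; choose another u0")
        u_prev, u = u, r / b[n]
        c_prev = b[n]                 # part b): c_n = b_n
    return a, b, U

# Illustrative test case: a random 6-by-6 real-symmetric matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2
a, b, U = lanczos(H, rng.standard_normal(6))
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
print(np.linalg.eigvalsh(T))
print(np.linalg.eigvalsh(H))
```

Only `u`, `u_prev` and `Hu` are needed to advance the recursion; the matrix `U` is stored here purely so that orthonormality can be inspected.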

Because the eigenvalues of a tridiagonal matrix are given by the numerically
easy-to-find zeros of the associated monic polynomial $p_N(x)$, the Lanczos
algorithm provides a computationally efficient way of extracting the eigenvalues
from a large sparse matrix. In theory, the entries in the tridiagonal $H$ can be
computed while retaining only $u_n$, $u_{n-1}$ and $Hu_n$ in memory at any one time.
In practice, with finite precision computer arithmetic, orthogonality with the
earlier $u_m$ is eventually lost, and spurious or duplicated eigenvalues appear.
There exist, however, stratagems for identifying and eliminating these fake
eigenvalues.

The following two problems are "toy" versions of the Lax pair and tau function
constructions that arise in the general theory of soliton equations. They
provide useful practice in manipulating matrices and determinants.

Problem 2.18: The monic orthogonal polynomials $p_i(x)$ have inner products

$$ \langle p_i, p_j \rangle_w \equiv \int p_i(x) p_j(x) w(x)\,dx = h_i \delta_{ij}, $$

and obey the recursion relation

$$ x p_i(x) = p_{i+1}(x) + a_i p_i(x) + b_{i-1}^2 p_{i-1}(x); \qquad p_{-1}(x) = 0, \quad p_0(x) = 1. $$

Write the recursion relation as

$$ Lp = xp, $$

where

$$
L = \begin{pmatrix}
\ddots & \ddots & \ddots & \ddots & \vdots \\
\cdots & 1 & a_2 & b_1^2 & 0 \\
\cdots & 0 & 1 & a_1 & b_0^2 \\
\cdots & 0 & 0 & 1 & a_0
\end{pmatrix},
\qquad
p = \begin{pmatrix} \vdots \\ p_2 \\ p_1 \\ p_0 \end{pmatrix}.
$$

Suppose that

$$ w(x) = \exp\Big\{ -\sum_{n=1}^{\infty} t_n x^n \Big\}, $$

and consider how the $p_i(x)$ and the coefficients $a_i$ and $b_i^2$ vary with the
parameters $t_n$.

a) Show that

$$ \frac{\partial p}{\partial t_n} = M^{(n)} p, $$

where $M^{(n)}$ is some strictly upper triangular matrix, i.e. all entries on
and below its principal diagonal are zero.

b) By differentiating $Lp = xp$ with respect to $t_n$ show that

$$ \frac{\partial L}{\partial t_n} = [M^{(n)}, L]. $$

c) Compute the matrix elements $M^{(n)}_{ij}$, which by part a) obey

$$ M^{(n)}_{ij}\, h_j = \Big\langle p_j, \frac{\partial p_i}{\partial t_n} \Big\rangle_w $$

(note the interchange of the order of $i$ and $j$ in the $\langle\ ,\ \rangle$ product!) by
differentiating the orthogonality condition $\langle p_i, p_j \rangle_w = h_i \delta_{ij}$. Hence show
that

$$ M^{(n)} = (L^n)_+ $$

where $(L^n)_+$ denotes the strictly upper triangular projection of the $n$'th
power of $L$, i.e. the matrix $L^n$, but with its diagonal and lower triangular
entries replaced by zero.

Thus

$$ \frac{\partial L}{\partial t_n} = [(L^n)_+, L] $$

describes a family of deformations of the semi-infinite matrix $L$ that, in some
formal sense, preserve its eigenvalues $x$.

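Problem 2.19: Let the monic polynomials $p_n(x)$ be orthogonal with respect
to the weight function

$$ w(x) = \exp\Big\{ -\sum_{n=1}^{\infty} t_n x^n \Big\}. $$

Define the "tau-function" $\tau_n(t_1, t_2, t_3, \ldots)$ of the parameters $t_i$ to be the $n$-fold
integral

$$ \tau_n(t_1, t_2, \ldots) = \int\!\!\int\!\cdots\!\int dx_1\, dx_2 \cdots dx_n\, \Delta^2(x) \exp\Big\{ -\sum_{\nu=1}^{n} \sum_{m=1}^{\infty} t_m x_\nu^m \Big\}, $$

where

$$
\Delta(x) = \begin{vmatrix}
x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\
x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1
\end{vmatrix}
$$

is the $n$-by-$n$ Vandermonde determinant.

a) Show that

$$
\begin{vmatrix}
x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\
x_2^{n-1} & x_2^{n-2} & \cdots & x_2 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1
\end{vmatrix}
=
\begin{vmatrix}
p_{n-1}(x_1) & p_{n-2}(x_1) & \cdots & p_1(x_1) & p_0(x_1) \\
p_{n-1}(x_2) & p_{n-2}(x_2) & \cdots & p_1(x_2) & p_0(x_2) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
p_{n-1}(x_n) & p_{n-2}(x_n) & \cdots & p_1(x_n) & p_0(x_n)
\end{vmatrix}.
$$

b) Combine the identity from part a) with the orthogonality property of the
$p_n(x)$ to show that

$$ p_n(x) = \frac{1}{\tau_n} \int dx_1\, dx_2 \cdots dx_n\, \Delta^2(x) \prod_{\mu=1}^{n} (x - x_\mu) \exp\Big\{ -\sum_{\nu=1}^{n} \sum_{m=1}^{\infty} t_m x_\nu^m \Big\} = x^n\, \frac{\tau_n(t_1', t_2', t_3', \ldots)}{\tau_n(t_1, t_2, t_3, \ldots)}, $$

where

$$ t_m' = t_m + \frac{1}{m x^m}. $$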

Here are some exercises on distributions: 

Exercise 2.20: Let $f(x)$ be a continuous function. Observe that $f(x)\delta(x) = f(0)\delta(x)$.
Deduce that

$$ \frac{d}{dx}[f(x)\delta(x)] = f(0)\delta'(x). $$

If $f(x)$ were differentiable we might also have used the product rule to conclude
that

$$ \frac{d}{dx}[f(x)\delta(x)] = f'(x)\delta(x) + f(x)\delta'(x). $$

Show, by evaluating $f(0)\delta'(x)$ and $f'(x)\delta(x) + f(x)\delta'(x)$ on a test function
$\varphi(x)$, that these two expressions for the derivative of $f(x)\delta(x)$ are equivalent.

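Exercise 2.21: Let $\varphi(x)$ be a test function. Show that

$$ \frac{d}{dt}\left\{ P\!\int_{-\infty}^{\infty} \frac{\varphi(x)}{(x - t)}\,dx \right\} = P\!\int_{-\infty}^{\infty} \frac{\varphi(x) - \varphi(t)}{(x - t)^2}\,dx. $$

Show further that the right-hand-side of this equation is equal to

$$ -\left( \frac{d}{dx} P\!\left(\frac{1}{x - t}\right), \varphi \right) \equiv P\!\int_{-\infty}^{\infty} \frac{\varphi'(x)}{(x - t)}\,dx. $$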

Exercise 2.22: Let $\theta(x)$ be the step function or Heaviside distribution

$$ \theta(x) = \begin{cases} 1, & x > 0, \\ \text{undefined}, & x = 0, \\ 0, & x < 0. \end{cases} $$

By forming the weak derivative of both sides of the equation

$$ \lim_{\varepsilon \to 0_+} \ln(x + i\varepsilon) = \ln|x| + i\pi\theta(-x), $$

conclude that

$$ \lim_{\varepsilon \to 0_+} \left( \frac{1}{x + i\varepsilon} \right) = P\!\left(\frac{1}{x}\right) - i\pi\delta(x). $$



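Exercise 2.23: Use induction on $n$ to generalize exercise 2.21 and show that

$$ \frac{d^n}{dt^n}\left\{ P\!\int_{-\infty}^{\infty} \frac{\varphi(x)}{(x - t)}\,dx \right\} = n!\; P\!\int_{-\infty}^{\infty} \frac{1}{(x - t)^{n+1}} \left[ \varphi(x) - \sum_{m=0}^{n-1} \frac{1}{m!}(x - t)^m \varphi^{(m)}(t) \right] dx. $$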



Exercise 2.24: Let the non-local functional $S[f]$ be defined by

Compute the functional derivative of $S[f]$ and verify that it is given by
5S 



5f(x) 



x — X 



See exercise 6.10 for an occurrence of this functional.



Chapter 3 



Linear Ordinary Differential 
Equations 

In this chapter we will discuss linear ordinary differential equations. We will
not describe tricks for solving any particular equation, but instead focus on
those aspects of the general theory that we will need later.

We will consider either homogeneous equations, $Ly = 0$ with

$$ Ly \equiv p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y, \tag{3.1} $$

or inhomogeneous equations $Ly = f$. In full,

$$ p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = f(x). \tag{3.2} $$

We will begin with homogeneous equations.

3.1 Existence and Uniqueness of Solutions 

The fundamental result in the theory of differential equations is the existence 
and uniqueness theorem for systems of first order equations. 

3.1.1 Flows for first-order equations 

Let $x^1, x^2, \ldots, x^n$ be a system of coordinates in $\mathbb{R}^n$, and let $X^i(x^1, x^2, \ldots, x^n, t)$,
$i = 1, \ldots, n$, be the components of a $t$-dependent vector field. Consider the
system of first-order differential equations

$$
\begin{aligned}
\frac{dx^1}{dt} &= X^1(x^1, x^2, \ldots, x^n, t), \\
\frac{dx^2}{dt} &= X^2(x^1, x^2, \ldots, x^n, t), \\
&\ \,\vdots \\
\frac{dx^n}{dt} &= X^n(x^1, x^2, \ldots, x^n, t).
\end{aligned}
\tag{3.3}
$$

For a sufficiently smooth vector field $(X^1, X^2, \ldots, X^n)$ there is a unique solution
$x^i(t)$ for any initial condition $x^i(0) = x^i_0$. Rigorous proofs of this claim,
including a statement of exactly what "sufficiently smooth" means, can be
found in any standard book on differential equations. Here, we will simply
assume the result. It is of course "physically" plausible. Regard the $X^i$ as
being the components of the velocity field in a fluid flow, and the solution
$x^i(t)$ as the trajectory of a particle carried by the flow. A particle initially at
$x^i(0) = x^i_0$ certainly goes somewhere, and unless something seriously pathological
is happening, that "somewhere" will be unique.

Now introduce a single function $y(t)$, and set

$$
\begin{aligned}
x^1 &= y, \\
x^2 &= \dot y, \\
x^3 &= \ddot y, \\
&\ \,\vdots \\
x^n &= y^{(n-1)},
\end{aligned}
\tag{3.4}
$$

and, given smooth functions $p_0(t), \ldots, p_n(t)$ with $p_0(t)$ nowhere vanishing,
look at the particular system of equations

$$
\begin{aligned}
\frac{dx^1}{dt} &= x^2, \\
\frac{dx^2}{dt} &= x^3, \\
&\ \,\vdots \\
\frac{dx^{n-1}}{dt} &= x^n, \\
\frac{dx^n}{dt} &= -\frac{1}{p_0}\left( p_1 x^n + p_2 x^{n-1} + \cdots + p_n x^1 \right).
\end{aligned}
\tag{3.5}
$$

This system is equivalent to the single equation

$$ p_0(t)\frac{d^n y}{dt^n} + p_1(t)\frac{d^{n-1} y}{dt^{n-1}} + \cdots + p_{n-1}(t)\frac{dy}{dt} + p_n(t)\, y(t) = 0. \tag{3.6} $$

Thus an $n$-th order ordinary differential equation (ODE) can be written as a
first-order equation in $n$ dimensions, and we can exploit the uniqueness result
cited above. We conclude, provided $p_0$ never vanishes, that the differential
equation $Ly = 0$ has a unique solution, $y(t)$, for each set of initial data
$(y(0), \dot y(0), \ddot y(0), \ldots, y^{(n-1)}(0))$. Thus,

i) If $Ly = 0$ and $y(0) = 0$, $\dot y(0) = 0$, $\ddot y(0) = 0$, $\ldots$, $y^{(n-1)}(0) = 0$, we
deduce that $y \equiv 0$.

ii) If $y_1(t)$ and $y_2(t)$ obey the same equation $Ly = 0$, and have the same
initial data, then $y_1(t) = y_2(t)$.
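
The reduction of an $n$-th order equation to the first-order system (3.5) is also how such equations are integrated numerically. The following sketch is an illustration only: the helper names `as_first_order` and `rk4`, the fixed-step fourth-order Runge-Kutta scheme, and the test equation $\ddot y + y = 0$ are our own choices and are not taken from the text.

```python
import math

def as_first_order(p):
    """Right-hand side of system (3.5) for coefficient functions p = [p_0, ..., p_n].

    The state x = [y, y', ..., y^(n-1)] is stored with zero-based indices.
    """
    n = len(p) - 1
    def rhs(t, x):
        dx = list(x[1:])                       # dx^k/dt = x^(k+1)
        # p_k multiplies y^(n-k), which is x[n - k] in zero-based storage.
        top = -sum(p[k](t) * x[n - k] for k in range(1, n + 1)) / p[0](t)
        dx.append(top)
        return dx
    return rhs

def rk4(rhs, x0, t0, t1, steps=1000):
    """Integrate dx/dt = rhs(t, x) from t0 to t1 with classical Runge-Kutta."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)])
        k3 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)])
        k4 = rhs(t + h, [xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

# Test equation y'' + y = 0 with y(0) = 0, y'(0) = 1, whose solution is sin t.
p = [lambda t: 1.0, lambda t: 0.0, lambda t: 1.0]
y_end, dy_end = rk4(as_first_order(p), [0.0, 1.0], 0.0, 1.0)
print(y_end, math.sin(1.0))
```

The initial data $(y(0), \dot y(0))$ are exactly what the integrator needs as its starting state, which is the numerical counterpart of the uniqueness statement above.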



3.1.2 Linear independence 

In this section we will assume that $p_0$ does not vanish in the region of $x$ we are
interested in, and that all the $p_i$ remain finite and differentiable sufficiently
many times for our formulae to make sense.

Consider an $n$-th order linear differential equation

$$ p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = 0. \tag{3.7} $$

The set of solutions of this equation constitutes a vector space because if $y_1(x)$
and $y_2(x)$ are solutions, then so is any linear combination $\lambda y_1(x) + \mu y_2(x)$.
We will show that the dimension of this vector space is $n$. To see that this
is so, let $y_1(x)$ be a solution with initial data

$$ y_1(0) = 1, \quad y_1'(0) = 0, \quad \ldots, \quad y_1^{(n-1)}(0) = 0, \tag{3.8} $$

let $y_2(x)$ be a solution with

$$ y_2(0) = 0, \quad y_2'(0) = 1, \quad \ldots, \quad y_2^{(n-1)}(0) = 0, \tag{3.9} $$

and so on, up to $y_n(x)$, which has

$$ y_n(0) = 0, \quad y_n'(0) = 0, \quad \ldots, \quad y_n^{(n-1)}(0) = 1. \tag{3.10} $$

We claim that the functions $y_i(x)$ are linearly independent. Suppose, to the
contrary, that there are constants $\lambda_1, \ldots, \lambda_n$ such that

$$ 0 = \lambda_1 y_1(x) + \lambda_2 y_2(x) + \cdots + \lambda_n y_n(x). \tag{3.11} $$

Then

$$ 0 = \lambda_1 y_1(0) + \lambda_2 y_2(0) + \cdots + \lambda_n y_n(0) \quad\Rightarrow\quad \lambda_1 = 0. \tag{3.12} $$

Differentiating once and setting $x = 0$ gives

$$ 0 = \lambda_1 y_1'(0) + \lambda_2 y_2'(0) + \cdots + \lambda_n y_n'(0) \quad\Rightarrow\quad \lambda_2 = 0. \tag{3.13} $$

We continue in this manner all the way to

$$ 0 = \lambda_1 y_1^{(n-1)}(0) + \lambda_2 y_2^{(n-1)}(0) + \cdots + \lambda_n y_n^{(n-1)}(0) \quad\Rightarrow\quad \lambda_n = 0. \tag{3.14} $$

All the $\lambda_i$ are zero! There is therefore no non-trivial linear relation between
the $y_i(x)$, and they are indeed linearly independent.

The solutions $y_i(x)$ also span the solution space, because the unique solution
with initial data $y(0) = a_1$, $y'(0) = a_2$, $\ldots$, $y^{(n-1)}(0) = a_n$ can be written
in terms of them as

$$ y(x) = a_1 y_1(x) + a_2 y_2(x) + \cdots + a_n y_n(x). \tag{3.15} $$

Our chosen set of $n$ solutions is therefore a basis for the solution space of the
differential equation. The dimension of the solution space is therefore $n$, as
claimed.



3.1.3 The Wronskian 



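If we manage to find a different set of $n$ solutions, how will we know whether
they are also linearly independent? The essential tool is the Wronskian:

$$
W(y_1, \ldots, y_n; x) \stackrel{\rm def}{=}
\begin{vmatrix}
y_1 & y_2 & \cdots & y_n \\
y_1' & y_2' & \cdots & y_n' \\
\vdots & \vdots & \ddots & \vdots \\
y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)}
\end{vmatrix}.
\tag{3.16}
$$

Recall that the derivative of a determinant

$$
D = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
\tag{3.17}
$$

may be evaluated by differentiating row-by-row:

$$
\frac{dD}{dx} =
\begin{vmatrix}
a_{11}' & a_{12}' & \cdots & a_{1n}' \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
+
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21}' & a_{22}' & \cdots & a_{2n}' \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
+ \cdots +
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1}' & a_{n2}' & \cdots & a_{nn}'
\end{vmatrix}.
$$

Applying this to the derivative of the Wronskian, we find

$$
\frac{dW}{dx} =
\begin{vmatrix}
y_1 & y_2 & \cdots & y_n \\
y_1' & y_2' & \cdots & y_n' \\
\vdots & \vdots & \ddots & \vdots \\
y_1^{(n-2)} & y_2^{(n-2)} & \cdots & y_n^{(n-2)} \\
y_1^{(n)} & y_2^{(n)} & \cdots & y_n^{(n)}
\end{vmatrix}.
\tag{3.18}
$$

Only the term where the very last row is being differentiated survives. All
the other row derivatives give zero because they lead to a determinant with
two identical rows. Now, if the $y_i$ are all solutions of

$$ p_0 y^{(n)} + p_1 y^{(n-1)} + \cdots + p_n y = 0, \tag{3.19} $$

we can substitute

$$ y_i^{(n)} = -\frac{1}{p_0}\left( p_1 y_i^{(n-1)} + p_2 y_i^{(n-2)} + \cdots + p_n y_i \right), \tag{3.20} $$

use the row-by-row linearity of determinants,

$$
\begin{vmatrix}
\lambda a_{11} + \mu b_{11} & \lambda a_{12} + \mu b_{12} & \cdots & \lambda a_{1n} + \mu b_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{vmatrix}
= \lambda
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{vmatrix}
+ \mu
\begin{vmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{vmatrix},
\tag{3.21}
$$

and find, again because most terms have two identical rows, that only the
terms with $p_1$ survive. The end result is

$$ \frac{dW}{dx} = -\left( \frac{p_1}{p_0} \right) W. \tag{3.22} $$

Solving this first order equation gives

$$ W(y_i; x) = W(y_i; x_0) \exp\left\{ -\int_{x_0}^{x} \frac{p_1(\xi)}{p_0(\xi)}\,d\xi \right\}. \tag{3.23} $$

Since the exponential function itself never vanishes, $W(x)$ either vanishes at
all $x$, or never. This is Liouville's theorem, and (3.23) is called Liouville's
formula.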
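
Liouville's formula is easy to test on an equation whose solutions are known in closed form. The check below is a sketch under our own assumptions: we pick $y'' - 2y' + y = 0$ (so $p_0 = 1$, $p_1 = -2$), with solutions $e^x$ and $x e^x$, and the function names are illustrative.

```python
import math

def wronskian(x):
    """Wronskian W(y1, y2; x) for y1 = e^x and y2 = x e^x."""
    y1, dy1 = math.exp(x), math.exp(x)
    y2, dy2 = x * math.exp(x), (1.0 + x) * math.exp(x)
    return y1 * dy2 - dy1 * y2

def liouville(x, x0=0.0):
    """Right-hand side of (3.23) for the constant ratio p_1 / p_0 = -2."""
    return wronskian(x0) * math.exp(2.0 * (x - x0))

for x in (-1.0, 0.5, 2.0):
    print(x, wronskian(x), liouville(x))
```

In this example the Wronskian never vanishes, which is consistent with the two solutions being linearly independent.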

Now suppose that $y_1, \ldots, y_n$ are a set of $C^n$ functions of $x$, not necessarily
solutions of an ODE. Suppose further that there are constants $\lambda_i$, not all zero,
such that

$$ \lambda_1 y_1(x) + \lambda_2 y_2(x) + \cdots + \lambda_n y_n(x) \equiv 0, \tag{3.24} $$

(i.e. the functions are linearly dependent). Then the set of equations

$$
\begin{aligned}
\lambda_1 y_1(x) + \lambda_2 y_2(x) + \cdots + \lambda_n y_n(x) &= 0, \\
\lambda_1 y_1'(x) + \lambda_2 y_2'(x) + \cdots + \lambda_n y_n'(x) &= 0, \\
&\ \,\vdots \\
\lambda_1 y_1^{(n-1)}(x) + \lambda_2 y_2^{(n-1)}(x) + \cdots + \lambda_n y_n^{(n-1)}(x) &= 0,
\end{aligned}
\tag{3.25}
$$

has a non-trivial solution $\lambda_1, \lambda_2, \ldots, \lambda_n$, and so the determinant of the
coefficients,

$$
W = \begin{vmatrix}
y_1 & y_2 & \cdots & y_n \\
y_1' & y_2' & \cdots & y_n' \\
\vdots & \vdots & \ddots & \vdots \\
y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)}
\end{vmatrix},
\tag{3.26}
$$

must vanish. Thus

$$ \text{linear dependence} \quad\Rightarrow\quad W \equiv 0. $$

There is a partial converse of this result: Suppose that $y_1, \ldots, y_n$ are solutions
to an $n$-th order ODE and $W(y_i; x) = 0$ at $x = x_0$. Then there must exist a
set of $\lambda_i$, not all zero, such that

$$ Y(x) = \lambda_1 y_1(x) + \lambda_2 y_2(x) + \cdots + \lambda_n y_n(x) \tag{3.27} $$

has $0 = Y(x_0) = Y'(x_0) = \cdots = Y^{(n-1)}(x_0)$. This is because the system of
linear equations determining the $\lambda_i$ has the Wronskian as its determinant.
Now the function $Y(x)$ is a solution of the ODE and has vanishing initial
data. It is therefore identically zero. We conclude that

$$ \text{ODE and } W = 0 \quad\Rightarrow\quad \text{linear dependence}. $$



If there is no ODE, the Wronskian may vanish without the functions
being linearly dependent. As an example, consider

$$
y_1(x) = \begin{cases} 0, & x \le 0, \\ \exp\{-1/x^2\}, & x > 0, \end{cases}
\qquad
y_2(x) = \begin{cases} \exp\{-1/x^2\}, & x < 0, \\ 0, & x \ge 0. \end{cases}
\tag{3.28}
$$

We have $W(y_1, y_2; x) \equiv 0$, but $y_1$, $y_2$ are not proportional to one another, and
so not linearly dependent. (Note that $y_{1,2}$ are smooth functions. In particular
they have derivatives of all orders at $x = 0$.)

Given $n$ linearly independent smooth functions $y_i$, can we always find an
$n$-th order differential equation that has them as its solutions? The answer
had better be "no", or there would be a contradiction between the preceding
theorem and the counterexample to its extension. If the functions do satisfy
a common equation, however, we can use a Wronskian to construct it: Let

$$ Ly = p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y \tag{3.29} $$

be the differential polynomial in $y(x)$ that results from expanding

$$
D(y) = \begin{vmatrix}
y^{(n)} & y^{(n-1)} & \cdots & y \\
y_1^{(n)} & y_1^{(n-1)} & \cdots & y_1 \\
\vdots & \vdots & \ddots & \vdots \\
y_n^{(n)} & y_n^{(n-1)} & \cdots & y_n
\end{vmatrix}.
\tag{3.30}
$$

Whenever $y$ coincides with any of the $y_i$, the determinant will have two
identical rows, and so $Ly = 0$. The $y_i$ are indeed $n$ solutions of $Ly = 0$. As
we have noted, this construction cannot always work. To see what can go
wrong, observe that it gives

$$
p_0(x) = \begin{vmatrix}
y_1^{(n-1)} & y_1^{(n-2)} & \cdots & y_1 \\
y_2^{(n-1)} & y_2^{(n-2)} & \cdots & y_2 \\
\vdots & \vdots & \ddots & \vdots \\
y_n^{(n-1)} & y_n^{(n-2)} & \cdots & y_n
\end{vmatrix}
\propto W(y; x).
\tag{3.31}
$$

If this Wronskian is zero, then our construction fails to deliver an $n$-th order
equation. Indeed, taking $y_1$ and $y_2$ to be the functions in the example above
yields an equation in which all three coefficients $p_0$, $p_1$, $p_2$ are identically
zero.



3.2 Normal Form 

In elementary algebra a polynomial equation

$$ a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0, \tag{3.32} $$

with $a_0 \ne 0$, is said to be in normal form if $a_1 = 0$. We can always put such an
equation in normal form by defining a new variable $\tilde x$ with $x = \tilde x - a_1 (n a_0)^{-1}$.

By analogy, an $n$-th order linear ODE with no $y^{(n-1)}$ term is also said to
be in normal form. We can put an ODE in normal form by the substitution
$y = w\tilde y$, for a suitable function $w(x)$. Let

$$ p_0 y^{(n)} + p_1 y^{(n-1)} + \cdots + p_n y = 0. \tag{3.33} $$

Set $y = w\tilde y$. Using Leibniz' rule, we expand out

$$ (w\tilde y)^{(n)} = w\tilde y^{(n)} + n w' \tilde y^{(n-1)} + \frac{n(n-1)}{2!} w'' \tilde y^{(n-2)} + \cdots + w^{(n)} \tilde y. \tag{3.34} $$

The differential equation becomes, therefore,

$$ (w p_0)\, \tilde y^{(n)} + (p_1 w + p_0 n w')\, \tilde y^{(n-1)} + \cdots = 0. \tag{3.35} $$

We see that if we choose $w$ to be a solution of

$$ p_1 w + p_0 n w' = 0, \tag{3.36} $$

for example

$$ w(x) = \exp\left\{ -\frac{1}{n} \int_0^x \frac{p_1(\xi)}{p_0(\xi)}\,d\xi \right\}, \tag{3.37} $$

then $\tilde y$ obeys the equation

$$ (w p_0)\, \tilde y^{(n)} + \tilde p_2\, \tilde y^{(n-2)} + \cdots = 0, \tag{3.38} $$

with no second-highest derivative.

Example: For a second order equation,

$$ y'' + p_1 y' + p_2 y = 0, \tag{3.39} $$

we set $y(x) = v(x) \exp\{ -\frac{1}{2} \int_0^x p_1(\xi)\,d\xi \}$ and find that $v$ obeys

$$ v'' + \Omega v = 0, \tag{3.40} $$

where

$$ \Omega = p_2 - \frac{1}{2} p_1' - \frac{1}{4} p_1^2. \tag{3.41} $$

Reducing an equation to normal form gives us the best chance of solving
it by inspection. For physicists, another advantage is that a second-order
equation in normal form can be thought of as a Schrödinger equation,

$$ -\frac{d^2\psi}{dx^2} + (V(x) - E)\psi = 0, \tag{3.42} $$

and we can gain insight into the properties of the solution by bringing our
physics intuition and experience to bear.
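
As a worked illustration of (3.39) to (3.41), consider Bessel's equation. The choice of example is ours, and because $p_1 = 1/x$ is not integrable at the origin we take the lower limit of the integral in the substitution to be $1$ rather than $0$, which only changes $v$ by a constant factor.

```latex
% Bessel's equation written in the form (3.39):
y'' + \frac{1}{x}\,y' + \Bigl(1 - \frac{\nu^2}{x^2}\Bigr) y = 0,
\qquad p_1 = \frac{1}{x}, \quad p_2 = 1 - \frac{\nu^2}{x^2}.

% Substitution: y = v \exp\{-\tfrac12 \int_1^x d\xi/\xi\} = x^{-1/2}\, v.

% Coefficient of v in (3.40), computed from (3.41):
p_2 - \tfrac12 p_1' - \tfrac14 p_1^2
  = 1 - \frac{\nu^2}{x^2} + \frac{1}{2x^2} - \frac{1}{4x^2}
  = 1 - \frac{\nu^2 - \tfrac14}{x^2}.

% Normal form, read as a Schrodinger equation of the type (3.42):
-\,v'' + \frac{\nu^2 - \tfrac14}{x^2}\, v = v .
```

Read in this way, the normal form describes a particle of energy $E = 1$ moving in the potential $V(x) = (\nu^2 - \frac{1}{4})/x^2$, which for $\nu^2 > \frac{1}{4}$ is a repulsive centrifugal-type barrier.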



3.3 Inhomogeneous Equations 

A linear inhomogeneous equation is one with a source term:

$$ p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = f(x). \tag{3.43} $$

It is called "inhomogeneous" because the source term $f(x)$ does not contain
$y$, and so is different from the rest. We will devote an entire chapter to
the solution of such equations by the method of Green functions. Here, we
simply review some elementary material.

3.3.1 Particular integral and complementary function

One method of dealing with inhomogeneous problems, one that is especially
effective when the equation has constant coefficients, is simply to try and
guess a solution to (3.43). If you are successful, the guessed solution $y_{PI}$
is then called a particular integral. We may add any solution $y_{CF}$ of the
homogeneous equation

$$ p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = 0 \tag{3.44} $$

to $y_{PI}$ and it will still be a solution of the inhomogeneous problem. We
use this freedom to satisfy the boundary or initial conditions. The added
solution, $y_{CF}$, is called the complementary function.

Example: Charging capacitor. The capacitor in the circuit in figure 3.1 is
initially uncharged. The switch is closed at $t = 0$.

Figure 3.1: Capacitor circuit

The charge on the capacitor, $Q$, obeys

$$ R\frac{dQ}{dt} + \frac{Q}{C} = V, \tag{3.45} $$

where $R$, $C$, $V$ are constants. A particular integral is given by $Q(t) = CV$.
The complementary-function solution of the homogeneous problem is

$$ Q(t) = Q_0\, e^{-t/RC}, \tag{3.46} $$

where $Q_0$ is constant. The solution satisfying the initial conditions is

$$ Q(t) = CV\left( 1 - e^{-t/RC} \right). \tag{3.47} $$

3.3.2 Variation of parameters 

We now follow Lagrange, and solve

$$ p_0(x) y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = f(x) \tag{3.48} $$

by writing

$$ y = v_1 y_1 + v_2 y_2 + \cdots + v_n y_n \tag{3.49} $$

where the $y_i$ are the $n$ linearly independent solutions of the homogeneous
equation and the $v_i$ are functions of $x$ that we have to determine. This
method is called variation of parameters.

Now, differentiating gives

$$ y' = v_1 y_1' + v_2 y_2' + \cdots + v_n y_n' + \{ v_1' y_1 + v_2' y_2 + \cdots + v_n' y_n \}. \tag{3.50} $$

We will choose the $v$'s so as to make the terms in the braces vanish. Differentiate
again:

$$ y'' = v_1 y_1'' + v_2 y_2'' + \cdots + v_n y_n'' + \{ v_1' y_1' + v_2' y_2' + \cdots + v_n' y_n' \}. \tag{3.51} $$

Again, we will choose the $v$'s to make the terms in the braces vanish. We
proceed in this way until the very last step, at which we demand

$$ \{ v_1' y_1^{(n-1)} + v_2' y_2^{(n-1)} + \cdots + v_n' y_n^{(n-1)} \} = f(x)/p_0(x). \tag{3.52} $$

If you substitute the resulting $y$ into the differential equation, you will see
that the equation is satisfied.

We have imposed the following conditions on the $v_i'$:

$$
\begin{aligned}
v_1' y_1 + v_2' y_2 + \cdots + v_n' y_n &= 0, \\
v_1' y_1' + v_2' y_2' + \cdots + v_n' y_n' &= 0, \\
&\ \,\vdots \\
v_1' y_1^{(n-1)} + v_2' y_2^{(n-1)} + \cdots + v_n' y_n^{(n-1)} &= f(x)/p_0(x).
\end{aligned}
\tag{3.53}
$$

This system of linear equations will have a solution for $v_1', \ldots, v_n'$, provided
the Wronskian of the $y_i$ is non-zero. This, however, is guaranteed by the
assumed linear independence of the $y_i$. Having found the $v_1', \ldots, v_n'$, we obtain
the $v_1, \ldots, v_n$ themselves by a single integration.
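
At any fixed $x$ the conditions (3.53) are just a linear system whose coefficient matrix is the Wronskian matrix, so they can be solved mechanically. The sketch below is illustrative only: the helper name `v_prime`, the data layout, and the test equation $y'' + y = 1$ with $y_1 = \cos x$, $y_2 = \sin x$ are our own assumptions.

```python
import numpy as np

def v_prime(x, sols, f, p0):
    """Solve the linear system (3.53) for (v_1', ..., v_n') at the point x.

    sols[i][k] is a callable returning the k-th derivative of y_{i+1},
    for k = 0, ..., n-1.
    """
    n = len(sols)
    # Row k of the Wronskian matrix holds the k-th derivatives of y_1..y_n.
    A = np.array([[sols[i][k](x) for i in range(n)] for k in range(n)],
                 dtype=float)
    rhs = np.zeros(n)
    rhs[-1] = f(x) / p0(x)        # only the last condition has a source term
    return np.linalg.solve(A, rhs)

# Test case: y'' + y = 1, with homogeneous solutions cos x and sin x.
sols = [(np.cos, lambda x: -np.sin(x)), (np.sin, np.cos)]
x_test = 0.7
vp = v_prime(x_test, sols, f=lambda x: 1.0, p0=lambda x: 1.0)
print(vp, [-np.sin(x_test), np.cos(x_test)])
```

Here the Wronskian equals one, and the solve reproduces $v_1' = -\sin x$, $v_2' = \cos x$, which integrate to give the particular solution $y = \cos^2 x + \sin^2 x = 1$. `np.linalg.solve` fails precisely when the Wronskian matrix is singular, mirroring the requirement of linear independence.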

Example: First-order linear equation. A simple and useful application of this
method solves

$$ \frac{dy}{dx} + P(x) y = f(x). \tag{3.54} $$

The solution to the homogeneous equation is

$$ y_1 = e^{-\int_a^x P(s)\,ds}. \tag{3.55} $$

We therefore set

$$ y = v(x)\, e^{-\int_a^x P(s)\,ds}, \tag{3.56} $$

and find that

$$ v'(x)\, e^{-\int_a^x P(s)\,ds} = f(x). \tag{3.57} $$

We integrate once to find

$$ v(x) = \int_b^x f(\xi)\, e^{\int_a^\xi P(s)\,ds}\,d\xi, \tag{3.58} $$

and so

$$ y(x) = \int_b^x f(\xi) \left\{ e^{-\int_\xi^x P(s)\,ds} \right\} d\xi. \tag{3.59} $$

We select $b$ to satisfy the initial condition.



3.4 Singular Points 



So far in this chapter, we have been assuming, either explicitly or tacitly, that
our coefficients $p_i(x)$ are smooth, and that $p_0(x)$ never vanishes. If $p_0(x)$ does
become zero (or, more precisely, if one or more of the $p_i/p_0$ becomes singular)
then dramatic things happen, and the location of the zero of $p_0$ is called a
singular point of the differential equation. All other points are called ordinary
points.

In physics applications we often find singular points at the ends of the
interval in which we wish to solve our differential equation. For example, the
origin $r = 0$ is often a singular point when $r$ is the radial coordinate in plane
or spherical polars. The existence and uniqueness theorems that we have
relied on throughout this chapter may fail at singular endpoints. Consider,
for example, the equation

$$ x y'' + y' = 0, \tag{3.60} $$

which is singular at $x = 0$. The two linearly independent solutions for $x > 0$
are $y_1(x) = 1$ and $y_2(x) = \ln x$. The general solution is therefore $A + B \ln x$,
but no choice of $A$ and $B$ can satisfy the initial conditions $y(0) = a$, $y'(0) = b$
when $b$ is non-zero. Because of these complications, we will delay a systematic
study of singular endpoints until chapter 8.

3.4.1 Regular singular points

If, in the differential equation

$$ p_0 y'' + p_1 y' + p_2 y = 0, \tag{3.61} $$

we have a point $x = a$ such that

$$ p_0(x) = (x - a)^2 P(x), \quad p_1(x) = (x - a) Q(x), \quad p_2(x) = R(x), \tag{3.62} $$

where $P$ and $Q$ and $R$ are analytic¹ and $P$ and $Q$ non-zero in a neighbourhood
of $a$ then the point $x = a$ is called a regular singular point of the equation.
All other singular points are said to be irregular. Close to a regular singular
point $a$ the equation looks like

$$ P(a)(x - a)^2 y'' + Q(a)(x - a) y' + R(a) y = 0. \tag{3.63} $$

The solutions of this reduced equation are

$$ y_1 = (x - a)^{\lambda_1}, \quad y_2 = (x - a)^{\lambda_2}, \tag{3.64} $$

where $\lambda_{1,2}$ are the roots of the indicial equation

$$ \lambda(\lambda - 1) P(a) + \lambda Q(a) + R(a) = 0. \tag{3.65} $$

The solutions of the full equation are then

$$ y_1 = (x - a)^{\lambda_1} f_1(x), \quad y_2 = (x - a)^{\lambda_2} f_2(x), \tag{3.66} $$

where $f_{1,2}$ have power series solutions convergent in a neighbourhood of $a$.
An exception occurs when $\lambda_1$ and $\lambda_2$ coincide or differ by an integer, in which
case the second solution is of the form

$$ y_2 = (x - a)^{\lambda_1} \left( \ln(x - a)\, f_1(x) + f_2(x) \right), \tag{3.67} $$

where $f_1$ is the same power series that occurs in the first solution, and $f_2$ is
a new power series. You will probably have seen these statements proved by
the tedious procedure of setting

$$ y(x) = (x - a)^{\lambda} \left( b_0 + b_1 (x - a) + b_2 (x - a)^2 + \cdots \right), \tag{3.68} $$

and obtaining a recurrence relation determining the $b_i$. Far more insight is
obtained, however, by extending the equation and its solution to the complex
plane, where the structure of the solution is related to its monodromy
properties. If you are familiar with complex analytic methods, you might like
to look ahead to the discussion of monodromy in section ??.
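
As a quick check of the indicial equation, it can be applied to the singular equation $x y'' + y' = 0$ met earlier in this section. The working below is our own illustration; multiplying the equation by $x$ is simply a convenient way of bringing it to the form (3.62).

```latex
% Multiply x y'' + y' = 0 by x to match (3.61)-(3.62) with a = 0:
x^2 y'' + x\, y' = 0,
\qquad P(x) = 1, \quad Q(x) = 1, \quad R(x) = 0 .

% Indicial equation (3.65):
\lambda(\lambda - 1) + \lambda = \lambda^2 = 0
\quad\Longrightarrow\quad \lambda_1 = \lambda_2 = 0 .

% The roots coincide, so the second solution takes the logarithmic form (3.67):
y_1 = 1 \quad (f_1 = 1),
\qquad
y_2 = \ln x \quad (f_2 = 0).
```

The double root is the reason a logarithm appears in the general solution $A + B \ln x$ of this equation.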



¹ A function is analytic at a point if it has a power-series expansion that converges to
the function in a neighbourhood of the point.

3.5 Further Exercises and Problems

Exercise 3.1: Reduction of Order. Sometimes additional information about
the solutions of a differential equation enables us to reduce the order of the
equation, and so solve it.

a) Suppose that we know that $y_1 = u(x)$ is one solution to the equation

$$ y'' + V(x) y = 0. $$

By trying $y = u(x) v(x)$ show that

$$ y_2 = u(x) \int^x \frac{d\xi}{u^2(\xi)} $$

is also a solution of the differential equation. Is this new solution ever
merely a constant multiple of the old solution, or must it be linearly
independent? (Hint: evaluate the Wronskian $W(y_2, y_1)$.)

b) Suppose that we are told that the product, $y_1 y_2$, of the two solutions to
the equation $y'' + p_1 y' + p_2 y = 0$ is a constant. Show that this requires
$2 p_1 p_2 + p_2' = 0$.

c) By using ideas from part b) or otherwise, find the general solution of the
equation

$$ (x + 1) x^2 y'' + x y' - (x + 1)^3 y = 0. $$

Exercise 3.2: Show that the general solution of the differential equation

$$ \frac{d^2 y}{dx^2} - 2\frac{dy}{dx} + y = \frac{e^x}{1 + x^2} $$

is

$$ y(x) = A e^x + B x e^x - \frac{1}{2} e^x \ln(1 + x^2) + x e^x \tan^{-1} x. $$

Exercise 3.3: Use the method of variation of parameters to show that if $y_1(x)$
and $y_2(x)$ are linearly independent solutions to the equation

$$ p_0(x)\frac{d^2 y}{dx^2} + p_1(x)\frac{dy}{dx} + p_2(x) y = 0, $$

then the general solution of the equation

$$ p_0(x)\frac{d^2 y}{dx^2} + p_1(x)\frac{dy}{dx} + p_2(x) y = f(x) $$

is

$$ y(x) = A y_1(x) + B y_2(x) - y_1(x) \int^x \frac{y_2(\xi) f(\xi)}{p_0 W(y_1, y_2)}\,d\xi + y_2(x) \int^x \frac{y_1(\xi) f(\xi)}{p_0 W(y_1, y_2)}\,d\xi. $$

Problem 3.4: One-dimensional scattering theory. Consider the one-dimensional
Schrödinger equation

$$ -\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi, $$

where $V(x)$ is zero except in a finite interval $[-a, a]$ near the origin.




Figure 3.2: A typical potential $V$ for exercise 3.4

Let $L$ denote the left asymptotic region, $-\infty < x < -a$, and similarly let $R$
denote $a < x < \infty$. For $E = k^2$ there will be scattering solutions of the form

$$ \psi_k(x) = \begin{cases} e^{ikx} + r_L(k)\, e^{-ikx}, & x \in L, \\ t_L(k)\, e^{ikx}, & x \in R, \end{cases} $$

which for $k > 0$ describe waves incident on the potential $V(x)$ from the left.
There will be solutions with

$$ \psi_k(x) = \begin{cases} t_R(k)\, e^{ikx}, & x \in L, \\ e^{ikx} + r_R(k)\, e^{-ikx}, & x \in R, \end{cases} $$

which for $k < 0$ describe waves incident from the right. The wavefunctions
in $[-a, a]$ will naturally be more complicated. Observe that $[\psi_k(x)]^*$ is also a
solution of the Schrödinger equation.

By using properties of the Wronskian, show that:

a) $|r_{L,R}|^2 + |t_{L,R}|^2 = 1$,

b) $t_L(k) = t_R(-k)$.

c) Deduce from parts a) and b) that $|r_L(k)| = |r_R(-k)|$.

d) Take the specific example of $V(x) = \lambda\delta(x - b)$ with $|b| < a$. Compute the
transmission and reflection coefficients and hence show that $r_L(k)$ and
$r_R(-k)$ may differ in phase.

Exercise 3.5: Suppose $\psi(x)$ obeys a Schrödinger equation

$$ \left( -\frac{1}{2}\partial_x^2 + [V(x) - E] \right)\psi = 0. $$

a) Make a smooth and invertible change of independent variable by setting
$x = x(z)$ and find the second order differential equation in $z$ obeyed by
$\psi(z) \equiv \psi(x(z))$. Reduce this equation to normal form, and show that
the resulting equation is

$$ \left( -\frac{1}{2}\partial_z^2 + (x')^2 [V(x(z)) - E] - \frac{1}{4}\{x, z\} \right)\tilde\psi(z) = 0, $$

where the primes denote differentiation with respect to $z$, and

$$ \{x, z\} \stackrel{\rm def}{=} \frac{x'''}{x'} - \frac{3}{2}\left( \frac{x''}{x'} \right)^2 $$

is called the Schwarzian derivative of $x$ with respect to $z$. Schwarzian
derivatives play an important role in conformal field theory and string
theory.

b) Make a sequence of changes of variable $x \to z \to w$, and so establish
Cayley's identity

$$ \left( \frac{dz}{dw} \right)^2 \{x, z\} + \{z, w\} = \{x, w\}. $$

(Hint: If your proof takes more than one line, you are missing the point.)



Chapter 4 

Linear Differential Operators 



In this chapter we will begin to take a more sophisticated approach to dif- 
ferential equations. We will define, with some care, the notion of a linear 
differential operator, and explore the analogy between such operators and 
matrices. In particular, we will investigate what is required for a linear dif- 
ferential operator to have a complete set of eigenfunctions. 

4.1 Formal vs. Concrete Operators 

We will call the object 

d n d n ~ l 

which we also write as 

Po(x)d£ +p 1 (x)d%- 1 + ■ ■ -+p n (x), (4.2) 

a formal linear differential operator. The word "formal" refers to the fact 
that we are not yet worrying about what sort of functions the operator is 
applied to. 

4.1.1 The algebra of formal operators 

Even though they are not acting on anything in particular, we can still form 
products of operators. For example, if $v$ and $w$ are smooth functions of $x$ we 
can define the operators $\partial_x + v(x)$ and $\partial_x + w(x)$ and find 

$$ (\partial_x + v)(\partial_x + w) = \partial_x^2 + w' + (w + v)\partial_x + vw, \tag{4.3} $$

or 

$$ (\partial_x + w)(\partial_x + v) = \partial_x^2 + v' + (w + v)\partial_x + vw. \tag{4.4} $$

We see from this example that the operator algebra is not usually commutative. 
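The two products differ by the multiplication operator $w' - v'$. A minimal sketch that checks this on polynomials follows; the choices $v = x$, $w = x^2$ and the test function are arbitrary assumptions made for illustration, and the small polynomial helpers are hypothetical names, not part of any library.

```python
# Polynomials are coefficient lists [c0, c1, c2, ...] meaning c0 + c1*x + c2*x**2 + ...

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [0] * (len(p) + len(q))
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([k * p[k] for k in range(1, len(p))])

def d_plus(g):
    """Return the operator (d/dx + g) as a function acting on polynomials."""
    return lambda f: add(diff(f), mul(g, f))

v = [0, 1]        # v(x) = x        (illustrative choice)
w = [0, 0, 1]     # w(x) = x^2      (illustrative choice)
f = [1, 2, 0, 3]  # test function f(x) = 1 + 2x + 3x^3

lhs = d_plus(v)(d_plus(w)(f))                 # (d_x + v)(d_x + w) f
rhs = d_plus(w)(d_plus(v)(f))                 # (d_x + w)(d_x + v) f
commutator = add(lhs, [-c for c in rhs])      # difference of the two orderings
expected = mul(add(diff(w), [-c for c in diff(v)]), f)   # (w' - v') f
```

Here `commutator` and `expected` agree, so the two orderings differ exactly by multiplication by $w' - v' = 2x - 1$.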

The algebra of formal operators has some deep applications. Consider, 
for example, the operators 

$$ L = -\partial_x^2 + q(x) \tag{4.5} $$

and 

$$ P = \partial_x^3 + a(x)\partial_x + \partial_x a(x). \tag{4.6} $$

In the last expression, the combination $\partial_x a(x)$ means "first multiply by $a(x)$, 
and then differentiate the result," so we could also write 

$$ \partial_x a = a\partial_x + a'. \tag{4.7} $$

We can now form the commutator $[P, L] \equiv PL - LP$. After a little effort, 
we find 

$$ [P, L] = (3q' + 4a')\partial_x^2 + (3q'' + 4a'')\partial_x + q''' + 2aq' + a'''. \tag{4.8} $$

If we choose $a = -\frac{3}{4}q$, the commutator becomes a pure multiplication 
operator, with no differential part: 

$$ [P, L] = \frac{1}{4}q''' - \frac{3}{2}qq'. \tag{4.9} $$

The equation 

$$ \frac{dL}{dt} = [P, L], \tag{4.10} $$

or, equivalently, 

$$ \dot q = \frac{1}{4}q''' - \frac{3}{2}qq', \tag{4.11} $$

has a formal solution 

$$ L(t) = e^{tP}L(0)e^{-tP}, \tag{4.12} $$

showing that the time evolution of $L$ is given by a similarity transformation, 
which (again formally) does not change its eigenvalues. The partial differential 
equation (4.11) is the famous Korteweg-de Vries (KdV) equation, which 
has "soliton" solutions whose existence is intimately connected with the fact 
that it can be written as (4.10). The operators $P$ and $L$ are called a Lax 
pair, after Peter Lax who uncovered much of the structure. 
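The "little effort" behind the commutator can be spot-checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the potential $q = x^3$ and the test function are arbitrary choices, and the polynomial helpers are hypothetical. It applies $PL - LP$ with $a = -\frac{3}{4}q$ and compares the result with multiplication by $\frac{1}{4}q''' - \frac{3}{2}qq'$.

```python
from fractions import Fraction

# Polynomials are coefficient lists [c0, c1, ...]; Fractions keep the arithmetic exact.

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    out = [0] * (len(p) + len(q))
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p, order=1):
    for _ in range(order):
        p = [k * p[k] for k in range(1, len(p))]
    return trim(p)

q = [0, 0, 0, 1]                 # q(x) = x^3, an arbitrary test potential
a = scale(Fraction(-3, 4), q)    # the choice a = -(3/4) q

def L(f):                        # L = -d_x^2 + q
    return add(scale(-1, diff(f, 2)), mul(q, f))

def P(f):                        # P = d_x^3 + a d_x + d_x a
    return add(add(diff(f, 3), mul(a, diff(f))), diff(mul(a, f)))

f = [1, 1, 0, 0, 2]              # test function 1 + x + 2x^4
commutator = add(P(L(f)), scale(-1, L(P(f))))          # (PL - LP) f
multiplier = add(scale(Fraction(1, 4), diff(q, 3)),
                 scale(Fraction(-3, 2), mul(q, diff(q))))   # q'''/4 - (3/2) q q'
expected = mul(multiplier, f)
```

For this $q$ the multiplier is $\frac{3}{2} - \frac{9}{2}x^5$, and `commutator` equals `expected`, with no derivative of the test function surviving.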
4.1.2 Concrete operators 

We want to explore the analogies between linear differential operators and 
matrices acting on a finite-dimensional vector space. Because the theory of 
matrix operators makes much use of inner products and orthogonality, the 
analogy is closest if we work with a function space equipped with these same 
notions. We therefore let our differential operators act on $L^2[a, b]$, the Hilbert 
space of square-integrable functions on $[a, b]$. Now a differential operator 
cannot act on every function in the Hilbert space because not all of them 
are differentiable. Even though we will relax our notion of differentiability 
and permit weak derivatives, we must at least demand that the domain $\mathcal{D}$, 
the subset of functions on which we allow the operator to act, contain only 
functions that are sufficiently differentiable that the function resulting from 
applying the operator remains an element of $L^2[a, b]$. We will usually restrict 
the set of functions even further, by imposing boundary conditions at the 
endpoints of the interval. A linear differential operator is now defined as a 
formal linear differential operator, together with a specification of its domain 
$\mathcal{D}$. 

The boundary conditions that we will impose will always be linear and 
homogeneous. This is so that the domain of definition is a vector space. 
In other words, if $y_1$ and $y_2$ obey the boundary conditions then so should 
$\lambda y_1 + \mu y_2$. Thus, for a second-order operator 

$$ L = p_0\partial_x^2 + p_1\partial_x + p_2 \tag{4.13} $$

on the interval $[a, b]$, we might impose 

$$ \begin{aligned} B_1[y] &= \alpha_{11}y(a) + \alpha_{12}y'(a) + \beta_{11}y(b) + \beta_{12}y'(b) = 0, \\ B_2[y] &= \alpha_{21}y(a) + \alpha_{22}y'(a) + \beta_{21}y(b) + \beta_{22}y'(b) = 0, \end{aligned} \tag{4.14} $$

but we will not, in defining the differential operator, impose inhomogeneous 
conditions, such as 

$$ \begin{aligned} B_1[y] &= \alpha_{11}y(a) + \alpha_{12}y'(a) + \beta_{11}y(b) + \beta_{12}y'(b) = A, \\ B_2[y] &= \alpha_{21}y(a) + \alpha_{22}y'(a) + \beta_{21}y(b) + \beta_{22}y'(b) = B, \end{aligned} \tag{4.15} $$

with non-zero $A$, $B$, even though we will solve differential equations with 
such boundary conditions. 

Also, for an $n$-th order operator, we will not constrain derivatives of order 
higher than $n - 1$. This is reasonable¹: If we seek solutions of $Ly = f$ with $L$ 
a second-order operator, for example, then the values of $y''$ at the endpoints 
are already determined in terms of $y'$ and $y$ by the differential equation. We 
cannot choose to impose some other value. By differentiating the equation 
enough times, we can similarly determine all higher endpoint derivatives in 
terms of $y$ and $y'$. These two derivatives, therefore, are all we can fix by fiat. 

The boundary and differentiability conditions that we impose make $\mathcal{D}$ a 
subset of the entire Hilbert space. This subset will always be dense: any 
element of the Hilbert space can be obtained as an $L^2$ limit of functions in 
$\mathcal{D}$. In particular, there will never be a function in $L^2[a, b]$ that is orthogonal 
to all functions in $\mathcal{D}$. 

4.2 The Adjoint Operator 

One of the important properties of matrices, established in the appendix, 
is that a matrix that is self-adjoint, or Hermitian, may be diagonalized. In 
other words, the matrix has sufficiently many eigenvectors for them to form 
a basis for the space on which it acts. A similar property holds for self- 
adjoint differential operators, but we must be careful in our definition of 
self-adjointness. 

Before reading this section, we suggest you review the material on adjoint 
operators on finite-dimensional spaces that appears in the appendix. 

4.2.1 The formal adjoint 

Given a formal differential operator 

$$ L = p_0(x)\frac{d^n}{dx^n} + p_1(x)\frac{d^{n-1}}{dx^{n-1}} + \cdots + p_n(x) \tag{4.16} $$

and a weight function $w(x)$, real and positive on the interval $(a, b)$, we can 
find another such operator $L^\dagger$, such that, for any sufficiently differentiable 
$u(x)$ and $v(x)$, we have 

$$ w\left(u^*Lv - v(L^\dagger u)^*\right) = \frac{d}{dx}Q[u, v], \tag{4.17} $$

for some function $Q$, which depends bilinearly on $u$ and $v$ and their first $n - 1$ 
derivatives. We call $L^\dagger$ the formal adjoint of $L$ with respect to the weight $w$. 
The equation (4.17) is called Lagrange's identity. The reason for the name 
"adjoint" is that if we define an inner product 

$$ \langle u, v\rangle_w = \int_a^b w\,u^*v\,dx, \tag{4.18} $$

and if the functions $u$ and $v$ have boundary conditions that make $Q[u, v]\big|_a^b = 0$, 
then 

$$ \langle u, Lv\rangle_w = \langle L^\dagger u, v\rangle_w, \tag{4.19} $$

which is the defining property of the adjoint operator on a vector space. The 
word "formal" means, as before, that we are not yet specifying the domain 
of the operator. 

¹ There is a deeper reason which we will explain in section 9.7.2. 

The method for finding the formal adjoint is straightforward: integrate 
by parts enough times to get all the derivatives off v and on to u. 
Example: If 

$$ L = -i\frac{d}{dx}, \tag{4.20} $$

then let us find the adjoint $L^\dagger$ with respect to the weight $w \equiv 1$. We start 
from 

$$ u^*(Lv) = u^*\left(-i\frac{d}{dx}v\right), $$

and use the integration-by-parts technique once to get the derivative off $v$ 
and onto $u^*$: 

$$ \begin{aligned} u^*\left(-i\frac{d}{dx}v\right) &= \left(i\frac{d}{dx}u^*\right)v - i\frac{d}{dx}(u^*v) \\ &= \left(-i\frac{d}{dx}u\right)^*v - i\frac{d}{dx}(u^*v) \\ &\equiv v(L^\dagger u)^* + \frac{d}{dx}Q[u, v]. \end{aligned} \tag{4.21} $$

We have ended up with the Lagrange identity 

$$ u^*\left(-i\frac{d}{dx}v\right) - v\left(-i\frac{d}{dx}u\right)^* = \frac{d}{dx}(-iu^*v), \tag{4.22} $$

and found that 

$$ L^\dagger = -i\frac{d}{dx}, \qquad Q[u, v] = -iu^*v. \tag{4.23} $$

The operator $-i\,d/dx$ (which you should recognize as the "momentum" operator 
from quantum mechanics) obeys $L = L^\dagger$, and is therefore formally 
self-adjoint, or Hermitian. 
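Example: Let 

$$ L = p_0\frac{d^2}{dx^2} + p_1\frac{d}{dx} + p_2, \tag{4.24} $$

with the $p_i$ all real. Again let us find the adjoint $L^\dagger$ with respect to the inner 
product with $w \equiv 1$. Now, proceeding as above, but integrating by parts 
twice, we find 

$$ u^*\left[p_0v'' + p_1v' + p_2v\right] - v\left[(p_0u)'' - (p_1u)' + p_2u\right]^* = \frac{d}{dx}\left[p_0(u^*v' - vu^{*\prime}) + (p_1 - p_0')u^*v\right]. \tag{4.25} $$

From this we read off that 

$$ \begin{aligned} L^\dagger &= \frac{d^2}{dx^2}p_0 - \frac{d}{dx}p_1 + p_2 \\ &= p_0\frac{d^2}{dx^2} + (2p_0' - p_1)\frac{d}{dx} + (p_0'' - p_1' + p_2). \end{aligned} \tag{4.26} $$

What conditions do we need to impose on $p_{0,1,2}$ for this $L$ to be formally 
self-adjoint with respect to the inner product with $w \equiv 1$? For $L = L^\dagger$ we 
need 

$$ \begin{aligned} p_0 &= p_0, \\ 2p_0' - p_1 &= p_1 \quad\Rightarrow\quad p_0' = p_1, \\ p_0'' - p_1' + p_2 &= p_2 \quad\Rightarrow\quad p_0'' = p_1'. \end{aligned} \tag{4.27} $$

We therefore require that $p_1 = p_0'$, and so 

$$ L = \frac{d}{dx}\left(p_0\frac{d}{dx}\right) + p_2, \tag{4.28} $$

which we recognize as a Sturm-Liouville operator. 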

Example: Reduction to Sturm-Liouville form. Another way to make the 
operator 

$$ L = p_0\frac{d^2}{dx^2} + p_1\frac{d}{dx} + p_2 \tag{4.29} $$

self-adjoint is by a suitable choice of weight function $w$. Suppose that $p_0$ is 
positive on the interval $(a, b)$, and that $p_0$, $p_1$, $p_2$ are all real. Then we may 
define 

$$ w = \frac{1}{p_0}\exp\left\{\int_a^x\left(\frac{p_1}{p_0}\right)dx'\right\} \tag{4.30} $$

and observe that it is positive on $(a, b)$, and that 

$$ Ly = \frac{1}{w}(wp_0y')' + p_2y. \tag{4.31} $$

Now 

$$ \langle u, Lv\rangle_w - \langle Lu, v\rangle_w = \left[wp_0(u^*v' - u^{*\prime}v)\right]_a^b, \tag{4.32} $$

where 

$$ \langle u, v\rangle_w = \int_a^b w\,u^*v\,dx. \tag{4.33} $$

Thus, provided $p_0$ does not vanish, there is always some inner product with 
respect to which a real second-order differential operator is formally 
self-adjoint. 
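As a concrete illustration of the weight-function recipe, take the operator $d^2/dx^2 - 2x\,d/dx$, so that $p_0 = 1$, $p_1 = -2x$, $p_2 = 0$. This example, the base point $a = 0$, and the test function $\sin x$ are assumptions chosen for the sketch. The code builds $w$ by numerical quadrature and checks that $\frac{1}{w}(wp_0y')'$ reproduces $p_0y'' + p_1y'$.

```python
import math

def p0(x):
    return 1.0

def p1(x):
    return -2.0 * x

def weight(x, a=0.0, steps=2000):
    """w(x) = (1/p0) exp( int_a^x p1/p0 dx' ), using the trapezoidal rule."""
    h = (x - a) / steps
    total = 0.5 * (p1(a) / p0(a) + p1(x) / p0(x))
    for k in range(1, steps):
        t = a + k * h
        total += p1(t) / p0(t)
    return math.exp(h * total) / p0(x)

# Test function and its exact derivatives.
def y(x):
    return math.sin(x)

def dy(x):
    return math.cos(x)

def d2y(x):
    return -math.sin(x)

def sturm_liouville_side(x, h=1e-4):
    """(1/w) d/dx (w p0 y'), with the outer derivative taken by central difference."""
    flux = lambda t: weight(t) * p0(t) * dy(t)
    return (flux(x + h) - flux(x - h)) / (2 * h) / weight(x)

x = 0.7
direct = p0(x) * d2y(x) + p1(x) * dy(x)     # p0 y'' + p1 y' evaluated directly
via_weight = sturm_liouville_side(x)         # the same quantity in Sturm-Liouville form
```

For this choice the weight is $w = e^{-x^2}$, the familiar Hermite weight, and the two evaluations agree to within the finite-difference error.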



Note that with 

$$ Ly = \frac{1}{w}(wp_0y')' + p_2y, \tag{4.34} $$

the eigenvalue equation 

$$ Ly = \lambda y \tag{4.35} $$

can be written 

$$ (wp_0y')' + p_2wy = \lambda wy. \tag{4.36} $$

When you come across a differential equation where, in the term containing 
the eigenvalue $\lambda$, the eigenfunction is being multiplied by some other function, 
you should immediately suspect that the operator will turn out to be 
self-adjoint with respect to the inner product having this other function as its 
weight. 

Illustration (Bargmann-Fock space): This is a more exotic example of a 
formal adjoint. You may have met with it in quantum mechanics. Consider 
the space of polynomials $P(z)$ in the complex variable $z = x + iy$. Define an 
inner product by 

$$ \langle P, Q\rangle = \frac{1}{\pi}\int d^2z\, e^{-z^*z}\,[P(z)]^*Q(z), $$

where $d^2z \equiv dx\,dy$ and the integration is over the entire $x, y$ plane. With 
this inner product, we have 

$$ \langle z^n, z^m\rangle = n!\,\delta_{nm}. $$

If we define 

$$ \hat a = \frac{d}{dz}, $$

then 

$$ \begin{aligned} \langle P, \hat aQ\rangle &= \frac{1}{\pi}\int d^2z\, e^{-z^*z}\,[P(z)]^*\frac{d}{dz}Q(z) \\ &= \frac{1}{\pi}\int d^2z\, e^{-z^*z}\,z^*[P(z)]^*Q(z) \\ &= \frac{1}{\pi}\int d^2z\, e^{-z^*z}\,[zP(z)]^*Q(z) \\ &= \langle \hat a^\dagger P, Q\rangle, \end{aligned} $$

where $\hat a^\dagger = z$, i.e. the operation of multiplication by $z$. In this case, the 
adjoint is not even a differential operator.² 

Exercise 4.1: Consider the differential operator $L = i\,d/dx$. Find the formal 
adjoint of $L$ with respect to the inner product $\langle u, v\rangle = \int w\,u^*v\,dx$, and find 
the corresponding surface term $Q[u, v]$. 



² In deriving this result we have used the Wirtinger calculus where $z$ and $z^*$ are treated 
as independent variables so that 

$$ \frac{d}{dz}e^{-z^*z} = -z^*e^{-z^*z}, $$

and observed that, because $[P(z)]^*$ is a function of $z^*$ only, 

$$ \frac{d}{dz}[P(z)]^* = 0. $$

If you are uneasy at regarding $z$, $z^*$ as independent, you should confirm these formulae 
by expressing $z$ and $z^*$ in terms of $x$ and $y$, and using 

$$ \frac{d}{dz} \equiv \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right), \qquad \frac{d}{dz^*} \equiv \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right). $$

Exercise 4.2: Sturm-Liouville forms. By constructing appropriate weight functions 
$w(x)$ convert the following common operators into Sturm-Liouville form: 

a) $L = (1 - x^2)\,d^2/dx^2 + [(\mu - \nu) - (\mu + \nu + 2)x]\,d/dx$. 

b) $L = (1 - x^2)\,d^2/dx^2 - 3x\,d/dx$. 

c) $L = d^2/dx^2 - 2x(1 - x^2)^{-1}\,d/dx - m^2(1 - x^2)^{-1}$. 

4.2.2 A simple eigenvalue problem 

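A finite Hermitian matrix has a complete set of orthonormal eigenvectors. 
Does the same property hold for a Hermitian differential operator? 
Consider the differential operator 

$$ T = -\partial_x^2, \qquad \mathcal{D}(T) = \{y, Ty \in L^2[0, 1] : y(0) = y(1) = 0\}. \tag{4.37} $$

With the inner product 

$$ \langle y_1, y_2\rangle = \int_0^1 y_1^*y_2\,dx \tag{4.38} $$

we have 

$$ \langle y_1, Ty_2\rangle - \langle Ty_1, y_2\rangle = \left[y_1^{\prime *}y_2 - y_1^*y_2'\right]_0^1 = 0. \tag{4.39} $$

The integrated-out part is zero because both $y_1$ and $y_2$ satisfy the boundary 
conditions. We see that 

$$ \langle y_1, Ty_2\rangle = \langle Ty_1, y_2\rangle \tag{4.40} $$

and so $T$ is Hermitian or symmetric. 

The eigenfunctions and eigenvalues of $T$ are 

$$ y_n(x) = \sin n\pi x, \qquad \lambda_n = n^2\pi^2, \qquad n = 1, 2, \ldots. \tag{4.41} $$

We see that: 

i) the eigenvalues are real; 

ii) the eigenfunctions for different $\lambda_n$ are orthogonal, 

$$ 2\int_0^1 \sin n\pi x\,\sin m\pi x\,dx = \delta_{nm}, \qquad n, m = 1, 2, \ldots; \tag{4.42} $$

iii) the normalized eigenfunctions $\varphi_n(x) = \sqrt{2}\sin n\pi x$ are complete: any 
function in $L^2[0, 1]$ has an ($L^2$) convergent expansion as 

$$ y(x) = \sum_{n=1}^{\infty} a_n\sqrt{2}\sin n\pi x \tag{4.43} $$

where 

$$ a_n = \int_0^1 y(x)\sqrt{2}\sin n\pi x\,dx. \tag{4.44} $$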
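The orthonormality and expansion properties listed above are easy to check numerically. The following is a minimal sketch using Simpson's rule; the sample function $y = x(1 - x)$ and the helper names are assumptions chosen for illustration.

```python
import math

def phi(n, x):
    """Normalized eigenfunction sqrt(2) sin(n pi x) on [0, 1]."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def inner(f, g, steps=2000):
    """Real L^2[0,1] inner product by composite Simpson's rule."""
    h = 1.0 / steps
    total = f(0.0) * g(0.0) + f(1.0) * g(1.0)
    for k in range(1, steps):
        x = k * h
        total += (4 if k % 2 else 2) * f(x) * g(x)
    return total * h / 3.0

def y(x):
    return x * (1.0 - x)          # sample function obeying y(0) = y(1) = 0

def coefficient(n):
    """Expansion coefficient a_n = <phi_n, y>."""
    return inner(y, lambda x: phi(n, x))

def partial_sum(x, terms):
    """Sum of the first `terms` terms of the eigenfunction expansion at x."""
    return sum(coefficient(n) * phi(n, x) for n in range(1, terms + 1))

gram_12 = inner(lambda x: phi(1, x), lambda x: phi(2, x))   # orthogonality
gram_33 = inner(lambda x: phi(3, x), lambda x: phi(3, x))   # normalization
approx = partial_sum(0.5, 9)                                # compare with y(0.5) = 0.25
```

With nine terms the partial sum at $x = 1/2$ already lies within about $10^{-3}$ of the exact value $0.25$; the coefficients fall off like $1/n^3$ because the sample function satisfies the boundary conditions.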

This all looks very good: exactly the properties we expect for finite Hermitian 
matrices. Can we carry over all the results of finite matrix theory to 
these Hermitian operators? The answer sadly is no! Here is a counterexample: 

Let 

$$ T = -i\partial_x, \qquad \mathcal{D}(T) = \{y, Ty \in L^2[0, 1] : y(0) = y(1) = 0\}. \tag{4.45} $$

Again 

$$ \begin{aligned} \langle y_1, Ty_2\rangle - \langle Ty_1, y_2\rangle &= \int_0^1 dx\,\{y_1^*(-i\partial_xy_2) - (-i\partial_xy_1)^*y_2\} \\ &= -i\left[y_1^*y_2\right]_0^1 = 0. \end{aligned} \tag{4.46} $$

Once more, the integrated out part vanishes due to the boundary conditions 
satisfied by $y_1$ and $y_2$, so $T$ is nicely Hermitian. Unfortunately, $T$ with these 
boundary conditions has no eigenfunctions at all, never mind a complete 
set! Any function satisfying $Ty = \lambda y$ will be proportional to $e^{i\lambda x}$, but an 
exponential function is never zero, and cannot satisfy the boundary conditions. 

It seems clear that the boundary conditions are the problem. We need 
a better definition of "adjoint" than the formal one, one that pays more 
attention to boundary conditions. We will then be forced to distinguish 
between mere Hermiticity, or symmetry, and true self-adjointness. 

Exercise 4.3: Another disconcerting example. Let $p = -i\partial_x$. Show that the 
following operator on the infinite real line is formally self-adjoint: 

$$ H = x^3p + px^3. \tag{4.47} $$

Now let 

$$ \psi_\lambda(x) = |x|^{-3/2}\exp\left\{-\frac{\lambda}{4x^2}\right\}, \tag{4.48} $$

where $\lambda$ is real and positive. Show that 

$$ H\psi_\lambda = -i\lambda\psi_\lambda, \tag{4.49} $$

so $\psi_\lambda$ is an eigenfunction with a purely imaginary eigenvalue. Examine the 
proof that Hermitian operators have real eigenvalues, and identify at which 
point it fails. (Hint: $H$ is formally self-adjoint because it is of the form $T + T^\dagger$. 
Now $\psi_\lambda$ is square-integrable, and so an element of $L^2(\mathbb{R})$. Is $T\psi_\lambda$ an element 
of $L^2(\mathbb{R})$?) 



4.2.3 Adjoint boundary conditions 

The usual definition of the adjoint operator in linear algebra is as follows: 
Given the operator $T : V \to V$ and an inner product $\langle\ ,\ \rangle$, we look at 
$\langle u, Tv\rangle$, and ask if there is a $w$ such that $\langle w, v\rangle = \langle u, Tv\rangle$ for all $v$. If there 
is, then $u$ is in the domain of $T^\dagger$, and we set $T^\dagger u = w$. 

For finite-dimensional vector spaces $V$ there always is such a $w$, and so 
the domain of $T^\dagger$ is the entire space. In an infinite dimensional Hilbert space, 
however, not all $\langle u, Tv\rangle$ can be written as $\langle w, v\rangle$ with $w$ a finite-length element 
of $L^2$. In particular $\delta$-functions are not allowed, but these are exactly what 
we would need if we were to express the boundary values appearing in the 
integrated out part, $Q[u, v]$, as an inner-product integral. We must therefore 
ensure that $u$ is such that $Q[u, v]$ vanishes, but then accept any $u$ with this 
property into the domain of $T^\dagger$. What this means in practice is that we look 
at the integrated out term $Q[u, v]$ and see what is required of $u$ to make 
$Q[u, v]$ zero for any $v$ satisfying the boundary conditions appearing in $\mathcal{D}(T)$. 
These conditions on $u$ are the adjoint boundary conditions, and define the 
domain of $T^\dagger$. 
Example: Consider 

$$ T = -i\partial_x, \qquad \mathcal{D}(T) = \{y, Ty \in L^2[0, 1] : y(1) = 0\}. \tag{4.50} $$

Now, 

$$ \begin{aligned} \int_0^1 dx\,u^*(-i\partial_xv) &= -i[u^*(1)v(1) - u^*(0)v(0)] + \int_0^1 dx\,(-i\partial_xu)^*v \\ &= -i[u^*(1)v(1) - u^*(0)v(0)] + \langle w, v\rangle, \end{aligned} \tag{4.51} $$

where $w = -i\partial_xu$. Since $v(x)$ is in the domain of $T$, we have $v(1) = 0$, and 
so the first term in the integrated out bit vanishes whatever value we take 
for $u(1)$. On the other hand, $v(0)$ could be anything, so to be sure that the 
second term vanishes we must demand that $u(0) = 0$. This, then, is the 
adjoint boundary condition. It defines the domain of $T^\dagger$: 

$$ T^\dagger = -i\partial_x, \qquad \mathcal{D}(T^\dagger) = \{y, Ty \in L^2[0, 1] : y(0) = 0\}. \tag{4.52} $$

For our problematic operator 

$$ T = -i\partial_x, \qquad \mathcal{D}(T) = \{y, Ty \in L^2[0, 1] : y(0) = y(1) = 0\}, \tag{4.53} $$

we have 

$$ \begin{aligned} \int_0^1 dx\,u^*(-i\partial_xv) &= -i[u^*v]_0^1 + \int_0^1 dx\,(-i\partial_xu)^*v \\ &= 0 + \langle w, v\rangle, \end{aligned} \tag{4.54} $$

where again $w = -i\partial_xu$. This time no boundary conditions need be imposed 
on $u$ to make the integrated out part vanish. Thus 

$$ T^\dagger = -i\partial_x, \qquad \mathcal{D}(T^\dagger) = \{y, Ty \in L^2[0, 1]\}. \tag{4.55} $$

Although any of these operators "$T = -i\partial_x$" is formally self-adjoint we 
have 

$$ \mathcal{D}(T) \neq \mathcal{D}(T^\dagger), \tag{4.56} $$

so $T$ and $T^\dagger$ are not the same operator and none of them is truly self-adjoint. 
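The role of the adjoint boundary condition can be seen numerically. The sketch below evaluates $\langle u, Tv\rangle - \langle Tu, v\rangle$ for $T = -i\partial_x$ on $[0, 1]$ with $v(1) = 0$, once for a $u$ obeying $u(0) = 0$ and once for a $u$ that does not. The particular functions are arbitrary assumptions chosen for illustration.

```python
import cmath

def integrate(f, steps=2000):
    """Composite Simpson's rule on [0, 1] for a complex-valued integrand."""
    h = 1.0 / steps
    total = f(0.0) + f(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def mismatch(u, du, v, dv):
    """<u, Tv> - <Tu, v> for T = -i d/dx on [0, 1]."""
    u_Tv = integrate(lambda x: u(x).conjugate() * (-1j) * dv(x))
    Tu_v = integrate(lambda x: ((-1j) * du(x)).conjugate() * v(x))
    return u_Tv - Tu_v

# v lies in D(T): v(1) = 0, while v(0) = 1 is unconstrained.
v = lambda x: (1 - x) * cmath.exp(1j * x)
dv = lambda x: (-1 + 1j * (1 - x)) * cmath.exp(1j * x)

# u_good obeys the adjoint condition u(0) = 0; u_bad has u(0) = 1.
u_good = lambda x: complex(x * x)
du_good = lambda x: complex(2 * x)
u_bad = lambda x: complex(1 + x)
du_bad = lambda x: complex(1.0)

good = mismatch(u_good, du_good, v, dv)   # boundary term vanishes
bad = mismatch(u_bad, du_bad, v, dv)      # boundary term i u*(0) v(0) = i survives
```

The first difference vanishes, while the second equals $i\,u^*(0)v(0) = i$, which is exactly the surviving piece of the integrated-out term.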

Exercise 4.4: Consider the differential operator $M = d^4/dx^4$. Find the formal 
adjoint of $M$ with respect to the inner product $\langle u, v\rangle = \int u^*v\,dx$, and find 
the corresponding surface term $Q[u, v]$. Find the adjoint boundary conditions 
defining the domain of $M^\dagger$ for the case 

$$ \mathcal{D}(M) = \{y, y^{(4)} \in L^2[0, 1] : y(0) = y'''(0) = y(1) = y'''(1) = 0\}. $$

4.2.4 Self-adjoint boundary conditions 

A formally self-adjoint operator $T$ is truly self-adjoint only if the domains of 
$T^\dagger$ and $T$ coincide. From now on, the unqualified phrase "self-adjoint" will 
always mean "truly self-adjoint." 

Self-adjointness is usually desirable in physics problems. It is therefore 
useful to investigate what boundary conditions lead to self-adjoint operators. 
For example, what are the most general boundary conditions we can impose 
on $T = -i\partial_x$ if we require the resultant operator to be self-adjoint? Now, 

$$ \int_0^1 dx\,u^*(-i\partial_xv) - \int_0^1 dx\,(-i\partial_xu)^*v = -i\left(u^*(1)v(1) - u^*(0)v(0)\right). \tag{4.57} $$

Demanding that the right-hand side be zero gives us, after division by $u^*(0)v(1)$, 

$$ \frac{u^*(1)}{u^*(0)} = \frac{v(0)}{v(1)}. \tag{4.58} $$

We require this to be true for any $u$ and $v$ obeying the same boundary 
conditions. Since $u$ and $v$ are unrelated, both sides must equal a constant $\kappa$, 
and furthermore this constant must obey $\kappa^* = \kappa^{-1}$ in order that $u(1)/u(0)$ 
be equal to $v(1)/v(0)$. Thus, the boundary condition is 

$$ \frac{u(1)}{u(0)} = \frac{v(1)}{v(0)} = e^{i\theta} \tag{4.59} $$

for some real angle $\theta$. The domain is therefore 

$$ \mathcal{D}(T) = \{y, Ty \in L^2[0, 1] : y(1) = e^{i\theta}y(0)\}. \tag{4.60} $$

These are twisted periodic boundary conditions. 

With these generalized periodic boundary conditions, everything we expect 
of a self-adjoint operator actually works: 

i) The functions $u_n = e^{i(2\pi n + \theta)x}$, with $n = \ldots, -2, -1, 0, 1, 2, \ldots$, are 
eigenfunctions of $T$ with eigenvalues $k_n \equiv 2\pi n + \theta$. 

ii) The eigenvalues are real. 

iii) The eigenfunctions form a complete orthonormal set. 
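A short numerical sketch of the first and third properties follows. The twist angle $\theta = 0.9$ and the mode numbers are arbitrary sample values assumed for illustration.

```python
import cmath
import math

theta = 0.9   # sample twist angle (assumed value for illustration)

def u(n, x):
    """Candidate eigenfunction exp(i (2 pi n + theta) x) of T = -i d/dx."""
    return cmath.exp(1j * (2 * math.pi * n + theta) * x)

def inner(f, g, steps=2000):
    """Complex L^2[0,1] inner product <f, g> = int f* g dx, by Simpson's rule."""
    h = 1.0 / steps
    total = f(0.0).conjugate() * g(0.0) + f(1.0).conjugate() * g(1.0)
    for k in range(1, steps):
        x = k * h
        total += (4 if k % 2 else 2) * f(x).conjugate() * g(x)
    return total * h / 3.0

boundary_ratio = u(3, 1.0) / u(3, 0.0)                       # should be e^{i theta}
overlap_same = inner(lambda x: u(2, x), lambda x: u(2, x))   # should be 1
overlap_diff = inner(lambda x: u(2, x), lambda x: u(-1, x))  # should be 0
```

Every $u_n$ picks up the same phase $e^{i\theta}$ across the interval, so all of them lie in the domain (4.60), and modes with different $n$ are orthogonal because their relative phase winds an integer number of times.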

Because self-adjoint operators possess a complete set of mutually orthogo- 
nal eigenfunctions, they are compatible with the interpretational postulates 
of quantum mechanics, where the square of the inner product of a state 
vector with an eigenstate gives the probability of measuring the associated 
eigenvalue. In quantum mechanics, self-adjoint operators are therefore called 
observables. 

Example: The Sturm-Liouville equation. With 

$$ L = \frac{d}{dx}p(x)\frac{d}{dx} + q(x), \qquad x \in [a, b], \tag{4.61} $$

we have 

$$ \langle u, Lv\rangle - \langle Lu, v\rangle = \left[p(u^*v' - u^{\prime *}v)\right]_a^b. \tag{4.62} $$

Let us seek to impose boundary conditions separately at the two ends. Thus, 
at $x = a$ we want 

$$ (u^*v' - u^{\prime *}v)\big|_a = 0, \tag{4.63} $$

or 

$$ \frac{u^{\prime *}(a)}{u^*(a)} = \frac{v'(a)}{v(a)}, \tag{4.64} $$

and similarly at $b$. If we want the boundary conditions imposed on $v$ (which 
define the domain of $L$) to coincide with those for $u$ (which define the domain 
of $L^\dagger$) then we must have 

$$ \frac{v'(a)}{v(a)} = \frac{u'(a)}{u(a)} = \tan\theta_a \tag{4.65} $$

for some real angle $\theta_a$, and similar boundary conditions with a $\theta_b$ at $b$. We 
can also write these boundary conditions as 

$$ \begin{aligned} \alpha_ay(a) + \beta_ay'(a) &= 0, \\ \alpha_by(b) + \beta_by'(b) &= 0. \end{aligned} \tag{4.66} $$

Deficiency indices and self-adjoint extensions 

There is a general theory of self-adjoint boundary conditions, due to Hermann 
Weyl and John von Neumann. We will not describe this theory in any 
detail, but simply give their recipe for counting the number of parameters 
in the most general self-adjoint boundary condition: To find this number we 
define an initial domain $\mathcal{D}_0(L)$ for the operator $L$ by imposing the strictest 
possible boundary conditions. This we do by setting to zero the boundary 
values of all the $y^{(n)}$ with $n$ less than the order of the equation. Next 
count the number of square-integrable eigenfunctions of the resulting adjoint 
operator corresponding to eigenvalue $\pm i$. The numbers, $n_+$ and $n_-$, of 
these eigenfunctions are called the deficiency indices. If they are not equal 
then there is no possible way to make the operator self-adjoint. If they are 
equal, $n_+ = n_- = n$, then there is an $n^2$ real-parameter family of self-adjoint 
extensions $\mathcal{D}(L) \supset \mathcal{D}_0(L)$ of the initial tightly-restricted domain. 

Example: The sad case of the "radial momentum operator." We wish to 
define the operator $P_r = -i\partial_r$ on the half-line $0 < r < \infty$. We start with the 
restrictive domain 

$$ P_r = -i\partial_r, \qquad \mathcal{D}_0(P_r) = \{y, P_ry \in L^2[0, \infty] : y(0) = 0\}. \tag{4.67} $$

We then have 

$$ P_r^\dagger = -i\partial_r, \qquad \mathcal{D}(P_r^\dagger) = \{y, P_r^\dagger y \in L^2[0, \infty]\} \tag{4.68} $$

with no boundary conditions. The equation $P_r^\dagger y = iy$ has a normalizable 
solution $y = e^{-r}$. The equation $P_r^\dagger y = -iy$ has no normalizable solution. 
The deficiency indices are therefore $n_+ = 1$, $n_- = 0$, and this operator 
cannot be rescued and made self-adjoint. 

Example: The Schrödinger operator. We now consider $-\partial_x^2$ on the half-line. 
Set 

$$ T = -\partial_x^2, \qquad \mathcal{D}_0(T) = \{y, Ty \in L^2[0, \infty] : y(0) = y'(0) = 0\}. \tag{4.69} $$

We then have 

$$ T^\dagger = -\partial_x^2, \qquad \mathcal{D}(T^\dagger) = \{y, T^\dagger y \in L^2[0, \infty]\}. \tag{4.70} $$

Again $T^\dagger$ comes with no boundary conditions. The eigenvalue equation 
$T^\dagger y = iy$ has one normalizable solution $y(x) = e^{(i-1)x/\sqrt{2}}$, and the equation 
$T^\dagger y = -iy$ also has one normalizable solution $y(x) = e^{-(i+1)x/\sqrt{2}}$. The 
deficiency indices are therefore $n_+ = n_- = 1$. The Weyl-von Neumann theory 
now says that, by relaxing the restrictive conditions $y(0) = y'(0) = 0$, we 
can extend the domain of definition of the operator to find a one-parameter 
family of self-adjoint boundary conditions. These will be the conditions 
$y'(0)/y(0) = \tan\theta$ that we found above. 
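The counting in these two examples reduces to asking which exponentials $e^{\kappa x}$ solve the eigenvalue equation and decay on the half-line. A minimal sketch of that bookkeeping follows; the helper names are hypothetical and the code only covers these two constant-coefficient cases.

```python
import math

def exponent_check(kappa, eigenvalue):
    """For y = exp(kappa x), -y'' = eigenvalue * y holds iff -kappa**2 == eigenvalue."""
    return abs(-kappa ** 2 - eigenvalue)

def is_normalizable(kappa):
    """exp(kappa x) is square-integrable on [0, infinity) iff Re(kappa) < 0."""
    return kappa.real < 0

# Exponents quoted in the text for the Schrodinger operator -d^2/dx^2.
kappa_plus = (1j - 1) / math.sqrt(2)     # candidate solution of -y'' = +i y
kappa_minus = -(1j + 1) / math.sqrt(2)   # candidate solution of -y'' = -i y

# Each second-order equation has two exponents, kappa and -kappa; count the decaying ones.
n_plus = sum(is_normalizable(k) for k in [kappa_plus, -kappa_plus])
n_minus = sum(is_normalizable(k) for k in [kappa_minus, -kappa_minus])

# Radial momentum -i d/dr is first order: -i*kappa = +i gives kappa = -1,
# and -i*kappa = -i gives kappa = +1, so there is one exponent per sign.
n_plus_radial = sum(is_normalizable(k) for k in [complex(-1.0)])
n_minus_radial = sum(is_normalizable(k) for k in [complex(+1.0)])
```

The counts come out as $(n_+, n_-) = (1, 1)$ for $-\partial_x^2$ and $(1, 0)$ for $-i\partial_r$, matching the deficiency indices stated in the two examples.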

If we consider the operator $-\partial_x^2$ on the finite interval $[a, b]$, then both 
solutions of $(T^\dagger \pm i)y = 0$ are normalizable, and the deficiency indices will 
be $n_+ = n_- = 2$. There should therefore be $2^2 = 4$ real parameters in the 
self-adjoint boundary conditions. This is a larger class than those we found 
in (4.66), because it includes generalized boundary conditions of the form 

$$ \begin{aligned} B_1[y] &= \alpha_{11}y(a) + \alpha_{12}y'(a) + \beta_{11}y(b) + \beta_{12}y'(b) = 0, \\ B_2[y] &= \alpha_{21}y(a) + \alpha_{22}y'(a) + \beta_{21}y(b) + \beta_{22}y'(b) = 0. \end{aligned} $$

Figure 4.1: Heterojunction and wave functions. 



Physics application: semiconductor heterojunction 

We now demonstrate why we have spent so much time on identifying self- 
adjoint boundary conditions: the technique is important in practical physics 
problems. 

A heterojunction is an atomically smooth interface between two related 
semiconductors, such as GaAs and Al$_x$Ga$_{1-x}$As, which typically possess 
different band-masses. We wish to describe the conduction electrons by an 
effective Schrödinger equation containing these band masses. What matching 
condition should we impose on the wavefunction $\psi(x)$ at the interface 
between the two materials? A first guess is that the wavefunction must be 
continuous, but this is not correct because the "wavefunction" in an 
effective-mass band-theory Hamiltonian is not the actual wavefunction (which is 
continuous) but instead a slowly varying envelope function multiplying a Bloch 
wavefunction. The Bloch function is rapidly varying, fluctuating strongly 
on the scale of a single atom. Because the Bloch form of the solution is no 
longer valid at a discontinuity, the envelope function is not even defined in 
the neighbourhood of the interface, and certainly has no reason to be 
continuous. There must still be some linear relation between the $\psi$'s in the two 
materials, but finding it will involve a detailed calculation on the atomic 
scale. In the absence of these calculations, we must use general principles to 
constrain the form of the relation. What are these principles? 

We know that, were we to do the atomic-scale calculation, the resulting 
connection between the right and left wavefunctions would: 

• be linear, 

• involve no more than $\psi(x)$ and its first derivative $\psi'(x)$, 

• make the Hamiltonian into a self-adjoint operator. 

We want to find the most general connection formula compatible with these 
principles. The first two are easy to satisfy. We therefore investigate what 
matching conditions are compatible with self-adjointness. 
Suppose that the band masses are $m_L$ and $m_R$, so that 

$$ H = \begin{cases} -\dfrac{1}{2m_L}\dfrac{d^2}{dx^2} + V_L(x), & x < 0, \\[2ex] -\dfrac{1}{2m_R}\dfrac{d^2}{dx^2} + V_R(x), & x > 0. \end{cases} \tag{4.71} $$

Integrating by parts, and keeping the terms at the interface gives us 

$$ \langle\psi_1, H\psi_2\rangle - \langle H\psi_1, \psi_2\rangle = \frac{1}{2m_R}\left\{\psi_{1R}^*\psi_{2R}' - \psi_{1R}^{\prime *}\psi_{2R}\right\} - \frac{1}{2m_L}\left\{\psi_{1L}^*\psi_{2L}' - \psi_{1L}^{\prime *}\psi_{2L}\right\}. \tag{4.72} $$

Here, $\psi_{L,R}$ refers to the boundary values of $\psi$ immediately to the left or right 
of the junction, respectively. Now we impose general linear homogeneous 
boundary conditions on $\psi_2$: 

$$ \begin{pmatrix} \psi_{2L} \\ \psi_{2L}' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \psi_{2R} \\ \psi_{2R}' \end{pmatrix}. \tag{4.73} $$

This relation involves four complex, and therefore eight real, parameters. 
Demanding that 

$$ \langle\psi_1, H\psi_2\rangle = \langle H\psi_1, \psi_2\rangle, \tag{4.74} $$

we find 

$$ \frac{1}{2m_L}\left\{\psi_{1L}^*(c\psi_{2R} + d\psi_{2R}') - \psi_{1L}^{\prime *}(a\psi_{2R} + b\psi_{2R}')\right\} = \frac{1}{2m_R}\left\{\psi_{1R}^*\psi_{2R}' - \psi_{1R}^{\prime *}\psi_{2R}\right\}, \tag{4.75} $$

and this must hold for arbitrary $\psi_{2R}$, $\psi_{2R}'$, so, picking off the coefficients of 
these expressions and complex conjugating, we find 

$$ \begin{pmatrix} \psi_{1R} \\ \psi_{1R}' \end{pmatrix} = \left(\frac{m_R}{m_L}\right)\begin{pmatrix} d^* & -b^* \\ -c^* & a^* \end{pmatrix}\begin{pmatrix} \psi_{1L} \\ \psi_{1L}' \end{pmatrix}. \tag{4.76} $$

Because we wish the domain of $H^\dagger$ to coincide with that of $H$, these must 
be the same conditions that we imposed on $\psi_2$. Thus we must have 

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \left(\frac{m_R}{m_L}\right)\begin{pmatrix} d^* & -b^* \\ -c^* & a^* \end{pmatrix}. \tag{4.77} $$

Since 

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \tag{4.78} $$

we see that this requires 

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = e^{i\phi}\sqrt{\frac{m_L}{m_R}}\begin{pmatrix} A & B \\ C & D \end{pmatrix}, \tag{4.79} $$

where $\phi$, $A$, $B$, $C$, $D$ are real, and $AD - BC = 1$. Demanding self-adjointness 
has therefore cut the original eight real parameters down to four. These 
can be determined either by experiment or by performing the microscopic 
calculation.³ Note that $4 = 2^2$, a perfect square, as required by the 
Weyl-von Neumann theory. 
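The claim that a matrix of the form $e^{i\phi}\sqrt{m_L/m_R}$ times a real unimodular matrix satisfies the self-adjointness condition can be confirmed numerically. In the sketch below the band masses, the phase, and the entries $A$, $B$, $C$ are illustrative sample values only, not data from any measurement, and the helper names are hypothetical.

```python
import cmath
import math

def matching_matrix(phi, A, B, C, D, m_L, m_R):
    """Build e^{i phi} sqrt(m_L/m_R) [[A, B], [C, D]]; the caller ensures AD - BC = 1."""
    s = cmath.exp(1j * phi) * math.sqrt(m_L / m_R)
    return [[s * A, s * B], [s * C, s * D]]

def inverse(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def adjoint_side(M, m_L, m_R):
    """The matrix (m_R/m_L) [[d*, -b*], [-c*, a*]] built from M = [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    r = m_R / m_L
    return [[r * d.conjugate(), -r * b.conjugate()],
            [-r * c.conjugate(), r * a.conjugate()]]

def max_difference(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

m_L, m_R = 0.067, 0.092           # sample band masses (illustrative values only)
A, B, C = 2.0, 3.0, 1.0
D = (1.0 + B * C) / A             # enforce AD - BC = 1
M = matching_matrix(0.4, A, B, C, D, m_L, m_R)
residual = max_difference(inverse(M), adjoint_side(M, m_L, m_R))

# A generic complex matrix, by contrast, fails the condition.
generic = [[1 + 2j, 0.5 + 0j], [0.3j, 1.0 + 0j]]
generic_residual = max_difference(inverse(generic), adjoint_side(generic, m_L, m_R))
```

The residual vanishes to rounding error for the constrained matrix and is of order one for the generic matrix, which is the numerical face of the reduction from eight real parameters to four.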

Exercise 4.5: Consider the Schrödinger operator $H = -\partial_x^2$ on the interval 
$[0, 1]$. Show that the most general self-adjoint boundary condition applicable 
to $H$ can be written as 

$$ \begin{pmatrix} \varphi(0) \\ \varphi'(0) \end{pmatrix} = e^{i\phi}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \varphi(1) \\ \varphi'(1) \end{pmatrix}, $$

where $\phi$, $a$, $b$, $c$, $d$ are real and $ad - bc = 1$. Consider $H$ as the quantum 
Hamiltonian of a particle on a ring constructed by attaching $x = 0$ to $x = 1$. 
Show that the self-adjoint boundary condition found above leads to unitary 
scattering at the point of join. Does the most general unitary point-scattering 
matrix correspond to the most general self-adjoint boundary condition? 



4.3 Completeness of Eigenfunctions 

Now that we have a clear understanding of what it means to be self-adjoint, 
we can reiterate the basic claim: an operator $T$ that is self-adjoint with 
respect to an $L^2[a, b]$ inner product possesses a complete set of mutually 
orthogonal eigenfunctions. The proof that the eigenfunctions are orthogonal 
is identical to that for finite matrices. We will sketch a proof of the 
completeness of the eigenfunctions of the Sturm-Liouville operator in the next 
section. 

The set of eigenvalues is, with some mathematical cavils, called the spectrum 
of $T$. It is usually denoted by $\sigma(T)$. An eigenvalue is said to belong to 
the point spectrum when its associated eigenfunction is normalizable, i.e. is 
a bona fide member of $L^2[a, b]$ having a finite length. Usually (but not 
always) the eigenvalues of the point spectrum form a discrete set, and so the 
point spectrum is also known as the discrete spectrum. When the operator 
acts on functions on an infinite interval, the eigenfunctions may fail to 
be normalizable. The associated eigenvalues are then said to belong to the 
continuous spectrum. Sometimes, e.g. the hydrogen atom, the spectrum is 
partly discrete and partly continuous. There is also something called the 
residual spectrum, but this does not occur for self-adjoint operators. 

³ For example, see: T. Ando, S. Mori, Surface Science 113 (1982) 124. 



4.3.1 Discrete spectrum 

The simplest problems have a purely discrete spectrum. We have eigenfunctions 
$\phi_n(x)$ such that 

$$ T\phi_n(x) = \lambda_n\phi_n(x), \tag{4.80} $$

where $n$ is an integer. After multiplication by suitable constants, the $\phi_n$ are 
orthonormal, 

$$ \int\phi_n^*(x)\phi_m(x)\,dx = \delta_{nm}, \tag{4.81} $$

and complete. We can express the completeness condition as the statement 
that 

$$ \sum_n\phi_n(x)\phi_n^*(x') = \delta(x - x'). \tag{4.82} $$

If we take this representation of the delta function and multiply it by $f(x')$ 
and integrate over $x'$, we find 

$$ f(x) = \sum_n\phi_n(x)\int\phi_n^*(x')f(x')\,dx'. \tag{4.83} $$

So, 

$$ f(x) = \sum_na_n\phi_n(x) \tag{4.84} $$

with 

$$ a_n = \int\phi_n^*(x')f(x')\,dx'. \tag{4.85} $$

This means that if we can expand a delta function in terms of the $\phi_n(x)$, we 
can expand any (square integrable) function. 

Figure 4.2: The sum $\sum_{n=1}^{70} 2\sin(n\pi x)\sin(n\pi x')$ for $x' = 0.4$. Take note of 
the very disparate scales on the horizontal and vertical axes. 

Warning: The convergence of the series $\sum_n\phi_n(x)\phi_n^*(x')$ to $\delta(x - x')$ is 
neither pointwise nor in the $L^2$ sense. The sum tends to a limit only in the 
sense of a distribution, meaning that we must multiply the partial sums by 
a smooth test function and integrate over $x$ before we have something that 
actually converges in any meaningful manner. As an illustration consider our 
favourite orthonormal set: $\phi_n(x) = \sqrt{2}\sin(n\pi x)$ on the interval $[0, 1]$. A plot 
of the first 70 terms in the sum 

$$ \sum_{n=1}^{\infty}\sqrt{2}\sin(n\pi x)\,\sqrt{2}\sin(n\pi x') = \delta(x - x') $$

is shown in figure 4.2. The "wiggles" on both sides of the spike at $x = x'$ 
do not decrease in amplitude as the number of terms grows. They do, 
however, become of higher and higher frequency. When multiplied by a 
smooth function and integrated, the contributions from adjacent positive and 
negative wiggle regions tend to cancel, and it is only after this integration 
that the sum tends to zero away from the spike at $x = x'$. 

Rayleigh-Ritz and completeness 

For the Schrödinger eigenvalue problem 

$$ Ly = -y'' + q(x)y = \lambda y, \qquad x \in [a, b], \tag{4.86} $$

the large eigenvalues are $\lambda_n \approx n^2\pi^2/(a - b)^2$. This is because the term $qy$ 
eventually becomes negligible compared to $\lambda y$, and we can then solve the 
equation with sines and cosines. We see that there is no upper limit to 
the magnitude of the eigenvalues. The eigenvalues of the Sturm-Liouville 
problem 

$$ Ly = -(py')' + qy = \lambda y, \qquad x \in [a, b], \tag{4.87} $$

are similarly unbounded. We will use this unboundedness of the spectrum to 
make an estimate of the rate of convergence of the eigenfunction expansion 
for functions in the domain of $L$, and extend this result to prove that the 
eigenfunctions form a complete set. 

We know from chapter one that the Sturm-Liouville eigenvalues are the 
stationary values of $\langle y, Ly\rangle$ when the function $y$ is constrained to have unit 
length, $\langle y, y\rangle = 1$. The lowest eigenvalue, $\lambda_0$, is therefore given by 

$$ \lambda_0 = \inf_{y \in \mathcal{D}(L)}\frac{\langle y, Ly\rangle}{\langle y, y\rangle}. \tag{4.88} $$

As the variational principle, this formula provides a well-known method of 
obtaining approximate ground state energies in quantum mechanics. Part of 
its effectiveness comes from the stationary nature of $\langle y, Ly\rangle$ at the minimum: 
a crude approximation to $y$ often gives a tolerably good approximation to $\lambda_0$. 
In the wider world of eigenvalue problems, the variational principle is named 
after Rayleigh and Ritz.⁴ 
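To see how forgiving the variational estimate is, consider $L = -d^2/dx^2$ on $[0, 1]$ with $y(0) = y(1) = 0$, whose lowest eigenvalue is $\pi^2$. The sketch below evaluates the quotient $\langle y, Ly\rangle/\langle y, y\rangle$ for a parabola; the trial function and the helper names are assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def rayleigh_quotient(y, dy, q=lambda x: 0.0, a=0.0, b=1.0):
    """<y, Ly>/<y, y> for L = -d^2/dx^2 + q with y(a) = y(b) = 0.

    After one integration by parts <y, Ly> = int (y'^2 + q y^2) dx,
    so only the first derivative of the trial function is needed.
    """
    numerator = simpson(lambda x: dy(x) ** 2 + q(x) * y(x) ** 2, a, b)
    denominator = simpson(lambda x: y(x) ** 2, a, b)
    return numerator / denominator

trial = lambda x: x * (1.0 - x)       # crude guess: a parabola vanishing at both ends
d_trial = lambda x: 1.0 - 2.0 * x

estimate = rayleigh_quotient(trial, d_trial)
exact = math.pi ** 2                  # lowest Dirichlet eigenvalue of -d^2/dx^2 on [0, 1]
```

The quotient evaluates to $10$, an upper bound that overshoots $\pi^2 \approx 9.87$ by only about 1.3 percent even though the parabola is a rough stand-in for $\sin\pi x$.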

Suppose we have already found the first n normalized eigenfunctions 
2/0)2/1; • • • )2/n-i- Let the space spanned by these functions be V n . Then an 
obvious extension of the variational principle gives 

X n = mf (4.89) 

yzv± {y,y) 

We now exploit this variational estimate to show that if we expand an arbi- 
trary y in the domain of L in terms of the full set of eigenfunctions y m , 

00 

y = ^2 a ™ym, (4.90) 

m=0 



4 J. W. Strutt (later Lord Rayleigh), "In Finding the Correction for the Open End of 
an Organ-Pipe." Phil. Trans. 161 (1870) 77; W. Ritz, "Uber eine neue Methode zur 
Losung gewisser Variationsprobleme der mathematischen Physik." J. reine angew. Math. 
135 (1908). 
where

$$a_m = (y_m, y),$$  (4.91)

then the sum does indeed converge to y.

Let

$$h_n = y - \sum_{m=0}^{n-1} a_m y_m$$  (4.92)

be the residual error after the first n terms. By definition, $h_n \in V_n^\perp$. Let
us assume that we have adjusted L, by adding a constant to q if necessary,
so that all the $\lambda_m$ are positive. This adjustment will not affect the $y_m$. We
expand out

$$(h_n, Lh_n) = (y, Ly) - \sum_{m=0}^{n-1} \lambda_m |a_m|^2,$$  (4.93)

where we have made use of the orthonormality of the y m . The subtracted 
sum is guaranteed positive, so 

$$(h_n, Lh_n) \le (y, Ly).$$  (4.94)

Combining this inequality with Rayleigh-Ritz tells us that 

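$$\frac{(y, Ly)}{(h_n, h_n)} \ge \frac{(h_n, Lh_n)}{(h_n, h_n)} \ge \lambda_n.$$  (4.95)

In other words

$$\frac{(y, Ly)}{\lambda_n} \ge \left\| y - \sum_{m=0}^{n-1} a_m y_m \right\|^2.$$  (4.96)

Since (y, Ly) is independent of n, and $\lambda_n \to \infty$, we have $\| y - \sum_{m=0}^{n-1} a_m y_m \|^2 \to 0$.
Thus the eigenfunction expansion indeed converges to y, and does so faster
than $\lambda_n^{-1}$ goes to zero.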
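The bound on the residual can be checked numerically in a simple case. The sketch below is an illustration under our own assumptions, not part of the text's argument: it takes L = -d^2/dx^2 on [0, 1] with Dirichlet conditions, eigenfunctions sqrt(2) sin((m+1) pi x) with eigenvalues ((m+1) pi)^2, and the function y = x(1 - x), and it approximates the inner products with a midpoint rule.

```python
import numpy as np

M = 20000
x = (np.arange(M) + 0.5) / M             # midpoint grid on [0, 1]


def inner(f, g):
    """Midpoint-rule approximation to the integral of f*g over [0, 1]."""
    return float(np.mean(f * g))


y = x * (1.0 - x)                        # assumed test function in the domain of L
y_Ly = inner(y, 2.0 * np.ones_like(x))   # Ly = -y'' = 2 for this y


def eigfn(m):
    """Normalized Dirichlet eigenfunction y_m of -d^2/dx^2 on [0, 1]."""
    return np.sqrt(2.0) * np.sin((m + 1) * np.pi * x)


def eigval(m):
    return ((m + 1) * np.pi) ** 2


def residual_and_bound(n):
    """Return (||h_n||^2, (y, Ly)/lambda_n) after subtracting the first n terms."""
    h = y.copy()
    for m in range(n):
        h -= inner(eigfn(m), y) * eigfn(m)
    return inner(h, h), y_Ly / eigval(n)


for n in range(6):
    print(n, residual_and_bound(n))
```

In this example the squared norm of the residual stays below (y, Ly)/lambda_n for every n, and in fact falls much faster, which is consistent with the estimate being an upper bound rather than a sharp rate.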

Our estimate of the rate of convergence applies only to the expansion of
functions y for which (y, Ly) is defined, i.e. to functions $y \in \mathcal{D}(L)$. The
domain $\mathcal{D}(L)$ is always a dense subset of the entire Hilbert space $L^2[a, b]$,
however, and, since a dense subset of a dense subset is also dense in the larger
space, we have shown that the linear span of the eigenfunctions is a dense
subset of $L^2[a, b]$. Combining this observation with the alternative definition
of completeness in section 2.2.3, we see that the eigenfunctions do indeed form a
complete orthonormal set. Any square integrable function therefore has a
convergent expansion in terms of the $y_m$, but the rate of convergence may
well be slower than that for functions $y \in \mathcal{D}(L)$.
Operator methods

Sometimes there are tricks for solving the eigenvalue problem.
Example: Quantum Harmonic Oscillator. Consider the operator

$$H = (-\partial_x + x)(\partial_x + x) + 1 = -\partial_x^2 + x^2.$$  (4.97)

This is in the form $Q^\dagger Q + 1$, where $Q = (\partial_x + x)$, and $Q^\dagger = (-\partial_x + x)$ is its
formal adjoint. If we write these operators in the opposite order we have

$$QQ^\dagger = (\partial_x + x)(-\partial_x + x) = -\partial_x^2 + x^2 + 1 = H + 1.$$  (4.98)

Now, if $\psi$ is an eigenfunction of $Q^\dagger Q$ with non-zero eigenvalue $\lambda$ then $Q\psi$ is
an eigenfunction of $QQ^\dagger$ with the same eigenvalue. This is because

$$Q^\dagger Q\psi = \lambda\psi$$  (4.99)

implies that

$$Q(Q^\dagger Q\psi) = \lambda Q\psi,$$  (4.100)

or

$$QQ^\dagger(Q\psi) = \lambda(Q\psi).$$  (4.101)

The only way that $Q\psi$ can fail to be an eigenfunction of $QQ^\dagger$ is if it happens
that $Q\psi = 0$, but this implies that $Q^\dagger Q\psi = 0$ and so the eigenvalue was zero.
Conversely, if the eigenvalue is zero then

$$0 = (\psi, Q^\dagger Q\psi) = (Q\psi, Q\psi),$$  (4.102)

and so $Q\psi = 0$. In this way, we see that $Q^\dagger Q$ and $QQ^\dagger$ have exactly the
same spectrum, with the possible exception of any zero eigenvalue.
Now notice that $Q^\dagger Q$ does have a zero eigenvalue because

$$\psi_0 = e^{-\frac{1}{2}x^2}$$  (4.103)

obeys $Q\psi_0 = 0$ and is normalizable. The operator $QQ^\dagger$, considered as an
operator on $L^2[-\infty, \infty]$, does not have a zero eigenvalue because this would
require $Q^\dagger\psi = 0$, and so

$$\psi = e^{+\frac{1}{2}x^2},$$  (4.104)

which is not normalizable, and so not an element of $L^2[-\infty, \infty]$.
Since

$$H = Q^\dagger Q + 1 = QQ^\dagger - 1,$$  (4.105)

we see that $\psi_0$ is an eigenfunction of H with eigenvalue 1, and so an eigenfunction
of $QQ^\dagger$ with eigenvalue 2. Hence $Q^\dagger\psi_0$ is an eigenfunction of $Q^\dagger Q$
with eigenvalue 2 and so an eigenfunction of H with eigenvalue 3. Proceeding
in this way we find that

$$\psi_n = (Q^\dagger)^n \psi_0$$  (4.106)

is an eigenfunction of H with eigenvalue 2n + 1.
Since $Q^\dagger = -e^{\frac{1}{2}x^2}\,\partial_x\, e^{-\frac{1}{2}x^2}$, we can write

$$\psi_n(x) = H_n(x)\, e^{-\frac{1}{2}x^2},$$  (4.107)

where

$$H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}$$  (4.108)

are the Hermite Polynomials.

This is a useful technique for any second-order operator that can be fac- 
torized — and a surprising number of the equations for "special functions" 
can be. You will see it later, both in the exercises and in connection with 
Bessel functions. 
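The ladder of eigenvalues 2n + 1 obtained from the factorization can be compared with a direct numerical diagonalization. The following is a minimal sketch under our own assumptions: the operator -d^2/dx^2 + x^2 is replaced by a three-point finite-difference matrix on a truncated interval, and the grid size and interval width are arbitrary illustrative choices.

```python
import numpy as np


def oscillator_eigenvalues(n_points=1200, half_width=10.0, n_levels=5):
    """Lowest eigenvalues of a finite-difference model of -d^2/dx^2 + x^2.

    The wavefunction is taken to vanish just outside [-half_width, half_width],
    which is harmless for the low-lying, rapidly decaying states.
    """
    x = np.linspace(-half_width, half_width, n_points)
    h = x[1] - x[0]
    main = 2.0 / h**2 + x**2                  # diagonal: kinetic term plus x^2
    off = -np.ones(n_points - 1) / h**2       # nearest-neighbour coupling
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:n_levels]


levels = oscillator_eigenvalues()
print(levels)
```

With these settings the lowest five numerical eigenvalues should lie close to 1, 3, 5, 7, 9, with small discretization errors that grow with n.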



Exercise 4.6: Show that we have found all the eigenfunctions and eigenvalues
of $H = -\partial_x^2 + x^2$. Hint: Show that Q lowers the eigenvalue by 2 and use the
fact that $Q^\dagger Q$ cannot have negative eigenvalues.

Problem 4.7: Schrödinger equations of the form

$$-\frac{d^2\psi}{dx^2} - l(l+1)\,\mathrm{sech}^2 x\,\psi = E\psi$$

are known as Pöschl-Teller equations. By setting $u = l\tanh x$ and following
the strategy of this problem one may relate solutions for l to those for l - 1 and
so find all bound states and scattering eigenfunctions for any integer l.

a) Suppose that we know that $\psi = \exp\{-\int^x u(x')\,dx'\}$ is a solution of

$$L\psi \equiv \left(-\frac{d^2}{dx^2} + W(x)\right)\psi = 0.$$

Show that L can be written as $L = M^\dagger M$ where

$$M = \left(\frac{d}{dx} + u(x)\right), \qquad M^\dagger = \left(-\frac{d}{dx} + u(x)\right),$$

the adjoint being taken with respect to the product $(u, v) = \int u^* v\,dx$.

b) Now assume L is acting on functions on $[-\infty, \infty]$ and that we do not have
to worry about boundary conditions. Show that given an eigenfunction
$\psi_-$ obeying $M^\dagger M\psi_- = \lambda\psi_-$ we can multiply this equation on the left
by M and so find an eigenfunction $\psi_+$ with the same eigenvalue for the
differential operator

$$MM^\dagger = \left(\frac{d}{dx} + u(x)\right)\left(-\frac{d}{dx} + u(x)\right)$$

and vice-versa. Show that this correspondence $\psi_- \leftrightarrow \psi_+$ will fail if, and
only if, $\lambda = 0$.

c) Apply the strategy from part b) in the case $u(x) = \tanh x$, for which one of the
two differential operators $M^\dagger M$, $MM^\dagger$ is (up to an additive constant)

$$H = -\frac{d^2}{dx^2} - 2\,\mathrm{sech}^2 x.$$

Show that H has eigenfunctions of the form $\psi_k = e^{ikx}P(\tanh x)$ and
eigenvalue $E = k^2$ for any k in the range $-\infty < k < \infty$. The function
$P(\tanh x)$ is a polynomial in $\tanh x$ which you should be able to find
explicitly. By thinking about the exceptional case $\lambda = 0$, show that H
has an eigenfunction $\psi_0(x)$, with eigenvalue E = -1, that tends rapidly
to zero as $x \to \pm\infty$. Observe that there is no corresponding eigenfunction
for the other operator of the pair.



4.3.2 Continuous spectrum 

Rather than give a formal discussion, we will illustrate this subject with some
examples drawn from quantum mechanics.

The simplest example is the free particle on the real line. We have 

$$H = -\partial_x^2.$$  (4.109)

We eventually want to apply this to functions on the entire real line, but we
will begin with the interval [-L/2, L/2], and then take the limit $L \to \infty$.
The operator H has formal eigenfunctions

$$\varphi_k(x) = e^{ikx},$$  (4.110)

corresponding to eigenvalues $\lambda = k^2$. Suppose we impose periodic boundary
conditions at $x = \pm L/2$:

$$\varphi_k(-L/2) = \varphi_k(+L/2).$$  (4.111)

This selects $k_n = 2\pi n/L$, where n is any positive, negative or zero integer,
and allows us to find the normalized eigenfunctions

$$\chi_n(x) = \frac{1}{\sqrt{L}}\, e^{ik_n x}.$$  (4.112)

The completeness condition is

$$\sum_{n=-\infty}^{\infty} \frac{1}{L}\, e^{ik_n x} e^{-ik_n x'} = \delta(x - x'), \qquad x, x' \in [-L/2, L/2].$$  (4.113)

As L becomes large, the eigenvalues become so close that they can hardly be
distinguished; hence the name continuous spectrum, 5 and the spectrum $\sigma(H)$
becomes the entire positive real line. In this limit, the sum on n becomes an
integral

$$\sum_{n=-\infty}^{\infty} \{\ldots\} \to \int dn\, \{\ldots\} = \int dk \left(\frac{dn}{dk}\right) \{\ldots\},$$  (4.114)

where

$$\frac{dn}{dk} = \frac{L}{2\pi}$$  (4.115)

is called the (momentum) density of states. If we divide this by L to get a
density of states per unit length, we get an L independent "finite" quantity,
the local density of states. We will often write

$$\frac{dn}{dk} = \rho(k).$$  (4.116)

If we express the density of states in terms of the eigenvalue $\lambda$ then, by
an abuse of notation, we have

$$\rho(\lambda) \equiv \frac{dn}{d\lambda} = \frac{L}{2\pi\sqrt{\lambda}}.$$  (4.117)



5 When L is strictly infinite, $\varphi_k(x)$ is no longer normalizable. Mathematicians do not
allow such un-normalizable functions to be considered as true eigenfunctions, and so a
point in the continuous spectrum is not, to them, actually an eigenvalue. Instead, they
say that a point $\lambda$ lies in the continuous spectrum if for any $\epsilon > 0$ there exists an
approximate eigenfunction $\varphi_\epsilon$ such that $\|\varphi_\epsilon\| = 1$, but $\|L\varphi_\epsilon - \lambda\varphi_\epsilon\| < \epsilon$. This is not a
profitable definition for us. We prefer to regard non-normalizable wavefunctions as being
distributions in our rigged Hilbert space.
Note that

$$\frac{dn}{d\lambda} = 2\,\frac{dn}{dk}\,\frac{dk}{d\lambda},$$  (4.118)

which looks a bit weird, but remember that two states, $\pm k_n$, correspond to
the same $\lambda$ and that the symbols

$$\frac{dn}{dk}, \qquad \frac{dn}{d\lambda}$$  (4.119)

are ratios of measures, i.e. Radon-Nikodym derivatives, not ordinary derivatives.

In the $L \to \infty$ limit, the completeness condition becomes

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ik(x - x')} = \delta(x - x'),$$  (4.120)

and the length L has disappeared.

Suppose that we now apply boundary conditions y = 0 on $x = \pm L/2$.
The normalized eigenfunctions are then

$$\chi_n = \sqrt{\frac{2}{L}}\, \sin k_n(x + L/2),$$  (4.121)

where $k_n = n\pi/L$. We see that the allowed k's are twice as close together as
they were with periodic boundary conditions, but now n is restricted to being
a positive non-zero integer. The momentum density of states is therefore

$$\rho(k) = \frac{dn}{dk} = \frac{L}{\pi},$$  (4.122)

which is twice as large as in the periodic case, but the eigenvalue density of
states is

$$\rho(\lambda) = \frac{L}{2\pi\sqrt{\lambda}},$$  (4.123)

which is exactly the same as before.
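The agreement of the two eigenvalue densities can be seen by simply counting levels. The sketch below, whose function names and sample parameters are our own illustrative choices, counts the eigenvalues below a cutoff for the periodic and for the vanishing boundary conditions and compares both with the integral of $L/(2\pi\sqrt{\lambda})$.

```python
import math


def count_periodic(L, lam_max):
    """Number of eigenvalues k_n^2 <= lam_max with k_n = 2*pi*n/L, n any integer."""
    n_max = math.floor(L * math.sqrt(lam_max) / (2.0 * math.pi))
    return 2 * n_max + 1          # n = 0 plus the pairs +/- n


def count_dirichlet(L, lam_max):
    """Number of eigenvalues k_n^2 <= lam_max with k_n = pi*n/L, n = 1, 2, ..."""
    return math.floor(L * math.sqrt(lam_max) / math.pi)


def count_continuum(L, lam_max):
    """Integral of rho(lam) = L/(2*pi*sqrt(lam)) from 0 to lam_max."""
    return L * math.sqrt(lam_max) / math.pi


L_box, cutoff = 1000.0, 4.0       # arbitrary sample values
print(count_periodic(L_box, cutoff),
      count_dirichlet(L_box, cutoff),
      count_continuum(L_box, cutoff))
```

The two discrete counts never differ from the continuum estimate by more than one state, however large the box, so the boundary conditions affect the count only at order unity.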

That the number of states per unit energy per unit volume does not 
depend on the boundary conditions at infinity makes physical sense: no 
local property of the sublunary realm should depend on what happens in 
the sphere of fixed stars. This point was not fully grasped by physicists, 
however, until Rudolf Peierls 6 explained that the quantum particle had to
actually travel to the distant boundary and back before the precise nature
of the boundary could be felt. This journey takes time T (depending on
the particle's energy) and from the energy-time uncertainty principle, we
can distinguish one boundary condition from another only by examining the
spectrum with an energy resolution finer than $\hbar/T$. Neither the distance nor
the nature of the boundary can affect the coarse details, such as the local 
density of states. 

The dependence of the spectrum of a general differential operator on 
boundary conditions was investigated by Hermann Weyl. Weyl distinguished 
two classes of singular boundary points: limit-circle, where the spectrum
depends on the choice of boundary conditions, and limit-point, where it does
not. For the Schrödinger operator, the point at infinity, which is "singular"
simply because it is at infinity, is in the limit-point class. We will discuss 
Weyl's theory of singular endpoints in chapter 8. 

Phase-shifts 

Consider the eigenvalue problem

$$\left(-\frac{d^2}{dr^2} + V(r)\right)\psi = E\psi$$  (4.124)

on the interval [0, R], and with boundary conditions $\psi(0) = 0 = \psi(R)$. This
problem arises when we solve the Schrödinger equation for a central potential
in spherical polar coordinates, and assume that the wavefunction is a function
of r only (i.e. S-wave, or l = 0). Again, we want the boundary at R to be
infinitely far away, but we will start with it at a large but finite distance,
and then take the $R \to \infty$ limit. Let us first deal with the simple case that
V(r) = 0; then the solutions are

$$\psi_k(r) \propto \sin kr,$$  (4.125)

with eigenvalue $E = k^2$, and with the allowed values of k being given by
$k_n R = n\pi$. Since

$$\int_0^R \sin^2(k_n r)\,dr = \frac{R}{2},$$  (4.126)

6 Peierls proved that the phonon contribution to the specific heat of a crystal could be 
correctly calculated by using periodic boundary conditions. Some sceptics had thought 
that such "unphysical" boundary conditions would give a result wrong by factors of two. 
the normalized wavefunctions are

$$\psi_k = \sqrt{\frac{2}{R}}\, \sin kr,$$  (4.127)

and completeness reads

$$\sum_{n=1}^{\infty} \left(\frac{2}{R}\right) \sin(k_n r)\sin(k_n r') = \delta(r - r').$$  (4.128)

As R becomes large, this sum goes over to an integral:

$$\sum_{n=1}^{\infty} \left(\frac{2}{R}\right) \sin(k_n r)\sin(k_n r') \to \int_0^\infty \frac{R\,dk}{\pi} \left(\frac{2}{R}\right) \sin(kr)\sin(kr').$$  (4.129)

Thus,

$$\left(\frac{2}{\pi}\right) \int_0^\infty dk\, \sin(kr)\sin(kr') = \delta(r - r').$$  (4.130)

As before, the large distance, here R, no longer appears.

Now consider the more interesting problem which has the potential V(r)
included. We will assume, for simplicity, that there is an $R_0$ such that V(r)
is zero for $r > R_0$. In this case, we know that the solution for $r > R_0$ is of
the form

$$\psi_k(r) = \sin(kr + \eta(k)),$$  (4.131)

where the phase shift $\eta(k)$ is a functional of the potential V. The eigenvalue
is still $E = k^2$.

Example: A delta-function shell. We take $V(r) = \lambda\delta(r - a)$. See figure 4.3.

Figure 4.3: Delta function shell potential.
A solution with eigenvalue $E = k^2$ and satisfying the boundary condition at
r = 0 is

$$\psi(r) = \begin{cases} A\sin(kr), & r < a, \\ \sin(kr + \eta), & r > a. \end{cases}$$  (4.132)

The conditions to be satisfied at r = a are:

i) continuity, $\psi(a - \epsilon) = \psi(a + \epsilon) \equiv \psi(a)$, and

ii) jump in slope, $-\psi'(a + \epsilon) + \psi'(a - \epsilon) + \lambda\psi(a) = 0$.

Therefore,

$$\frac{\psi'(a + \epsilon)}{\psi(a)} - \frac{\psi'(a - \epsilon)}{\psi(a)} = \lambda,$$  (4.133)

or

$$\frac{k\cos(ka + \eta)}{\sin(ka + \eta)} - \frac{k\cos(ka)}{\sin(ka)} = \lambda.$$  (4.134)

Thus,

$$\cot(ka + \eta) - \cot(ka) = \frac{\lambda}{k},$$  (4.135)

and

$$\eta(k) = -ka + \cot^{-1}\left(\frac{\lambda}{k} + \cot ka\right).$$  (4.136)

Figure 4.4: The phase shift $\eta(k)$ of equation (4.136) plotted against ka.



A sketch of $\eta(k)$ is shown in figure 4.4. The allowed values of k are required
by the boundary condition

$$\sin(kR + \eta(k)) = 0$$  (4.137)

to satisfy

$$kR + \eta(k) = n\pi.$$  (4.138)

This is a transcendental equation for k, and so finding the individual solutions
$k_n$ is not simple. We can, however, write

$$n = \frac{1}{\pi}\left(kR + \eta(k)\right)$$  (4.139)

and observe that, when R becomes large, only an infinitesimal change in k
is required to make n increment by unity. We may therefore regard n as a
"continuous" variable which we can differentiate with respect to k to find

$$\frac{dn}{dk} = \frac{1}{\pi}\left\{R + \frac{\partial\eta}{\partial k}\right\}.$$  (4.140)

The density of allowed k values is therefore

$$\rho(k) = \frac{1}{\pi}\left\{R + \frac{\partial\eta}{\partial k}\right\}.$$  (4.141)

For our delta-shell example, a plot of $\rho(k)$ appears in figure 4.5.

Figure 4.5: The density of states for the delta-shell potential. The extended
states are so close in energy that we need an optical aid to resolve individual
levels. The almost-bound resonance levels have to squeeze in between them.


This figure shows a sequence of resonant bound states at $ka = n\pi$ superposed
on the background continuum density of states appropriate to a large box of
length (R - a). Each "spike" contains one extra state, so the average density
of states is that of a box of length R. We see that changing the potential
does not create or destroy eigenstates, it just moves them around.

The spike is not exactly a delta function because of level repulsion between 
nearly degenerate eigenstates. The interloper elbows the nearby levels out of 
the way, and all the neighbours have to make do with a bit less room. The 
stronger the coupling between the states on either side of the delta-shell, the 
stronger is the inter-level repulsion, and the broader the resonance spike. 
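The phase shift and density of states for the delta-shell example are easy to tabulate. The sketch below is one possible implementation under stated assumptions: the shell strength is taken positive, the branch of the inverse cotangent is chosen so that $ka + \eta$ lies in the same interval $(m\pi, (m+1)\pi)$ as ka (which makes $\eta$ continuous in k), and the derivative of $\eta$ is taken by finite differences. The function names and sample parameters are ours.

```python
import numpy as np


def phase_shift(k, lam, a):
    """Continuous branch of eta(k) for V(r) = lam * delta(r - a), with lam > 0."""
    k = np.asarray(k, dtype=float)
    c = lam / k + np.cos(k * a) / np.sin(k * a)      # cot(ka + eta), from (4.135)
    # arctan2(1, c) is the inverse cotangent on (0, pi); shift it to the
    # interval that contains ka so that eta = 0 whenever ka is a multiple of pi.
    theta = np.pi * np.floor(k * a / np.pi) + np.arctan2(1.0, c)
    return theta - k * a


def density_of_states(k, lam, a, R):
    """rho(k) = (R + d eta/dk) / pi, with the derivative taken on the k grid."""
    eta = phase_shift(k, lam, a)
    return (R + np.gradient(eta, k)) / np.pi


k_grid = np.linspace(0.1, 12.0, 2000)     # sample wavenumbers (illustrative)
eta_grid = phase_shift(k_grid, lam=5.0, a=1.0)
rho_grid = density_of_states(k_grid, lam=5.0, a=1.0, R=50.0)
```

Plotting `rho_grid` against `k_grid * a` should show a background near (R - a)/pi with a spike just below each multiple of pi, which is the qualitative shape described for figure 4.5.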

Normalization factor 

We now evaluate

$$\int_0^R dr\, |\psi_k|^2 = N_k^{-2},$$  (4.142)

so as to find the normalized wavefunctions

$$\chi_k = N_k \psi_k.$$  (4.143)

Let $\psi_k(r)$ be a solution of

$$H\psi = \left(-\frac{d^2}{dr^2} + V(r)\right)\psi = k^2\psi$$  (4.144)

satisfying the boundary condition $\psi(0) = 0$, but not necessarily the boundary
condition at r = R. Such a solution exists for any k. We scale $\psi_k$ by
requiring that $\psi_k(r) = \sin(kr + \eta)$ for $r > R_0$. We now use Lagrange's
identity to write

$$(k^2 - k'^2)\int_0^R dr\, \psi_k \psi_{k'} = \int_0^R dr\, \{(H\psi_k)\psi_{k'} - \psi_k(H\psi_{k'})\}$$
$$= \left[\psi_k \psi'_{k'} - \psi'_k \psi_{k'}\right]_0^R$$
$$= \sin(kR + \eta)\, k' \cos(k'R + \eta') - k\cos(kR + \eta)\sin(k'R + \eta').$$  (4.145)

Here $\eta \equiv \eta(k)$ and $\eta' \equiv \eta(k')$. We have used $\psi_{k,k'}(0) = 0$, so the integrated out part vanishes at the
lower limit, and have used the explicit form of $\psi_{k,k'}$ at the upper limit.
Now differentiate with respect to k, and then set k = k'. We find

$$2k\int_0^R dr\, (\psi_k)^2 = -\frac{1}{2}\sin\left(2(kR + \eta)\right) + k\left\{R + \frac{\partial\eta}{\partial k}\right\}.$$  (4.146)

In other words,

$$\int_0^R dr\, (\psi_k)^2 = \frac{1}{2}\left(R + \frac{\partial\eta}{\partial k}\right) - \frac{1}{4k}\sin\left(2(kR + \eta)\right).$$  (4.147)

At this point, we impose the boundary condition at r = R. We therefore
have $kR + \eta = n\pi$ and the last term on the right hand side vanishes. The
final result for the normalization integral is therefore

$$\int_0^R dr\, |\psi_k|^2 = \frac{1}{2}\left(R + \frac{\partial\eta}{\partial k}\right).$$  (4.148)

Observe that the same expression occurs in both the density of states
and the normalization integral. When we use these quantities to write down
the contribution of the normalized states in the continuous spectrum to the
completeness relation we find that

$$\int_0^\infty dk \left(\frac{dn}{dk}\right) N_k^2\, \psi_k(r)\psi_k(r') = \left(\frac{2}{\pi}\right)\int_0^\infty dk\, \psi_k(r)\psi_k(r'),$$  (4.149)

the density of states and normalization factor having cancelled and disappeared
from the end result. This is a general feature of scattering problems:
The completeness relation must give a delta function when evaluated far from
the scatterer where the wavefunctions look like those of a free particle. So,
provided we normalize $\psi_k$ so that it reduces to a free particle wavefunction
at large distance, the measure in the integral over k must also be the same
as for the free particle.
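The relation between the normalization integral and the derivative of the phase shift can be checked directly for the delta-shell potential. The sketch below is a numerical illustration under our own assumptions: it uses the wavefunction A sin(kr) inside the shell and sin(kr + eta) outside, evaluates the two elementary integrals in closed form, and compares their sum with (R + d eta/dk)/2 - sin(2(kR + eta))/(4k), where the derivative is taken by a central difference. No boundary condition is imposed at r = R, so the oscillatory term is retained.

```python
import math


def phase_shift(k, lam, a):
    """Continuous branch of the delta-shell phase shift, lam > 0 assumed."""
    c = lam / k + math.cos(k * a) / math.sin(k * a)
    theta = math.pi * math.floor(k * a / math.pi) + math.atan2(1.0, c)
    return theta - k * a


def norm_integral_direct(k, lam, a, R):
    """Integral of psi_k^2 over [0, R], done piecewise in closed form."""
    eta = phase_shift(k, lam, a)
    A = math.sin(k * a + eta) / math.sin(k * a)          # continuity at r = a
    inside = A**2 * (a / 2.0 - math.sin(2.0 * k * a) / (4.0 * k))
    outside = (R - a) / 2.0 - (math.sin(2.0 * (k * R + eta))
                               - math.sin(2.0 * (k * a + eta))) / (4.0 * k)
    return inside + outside


def norm_integral_formula(k, lam, a, R, dk=1e-5):
    """(R + d eta/dk)/2 - sin(2(kR + eta))/(4k), with a central-difference derivative."""
    eta = phase_shift(k, lam, a)
    deta = (phase_shift(k + dk, lam, a) - phase_shift(k - dk, lam, a)) / (2.0 * dk)
    return 0.5 * (R + deta) - math.sin(2.0 * (k * R + eta)) / (4.0 * k)


for k in (0.7, 2.0, 4.0, 7.5):                           # sample wavenumbers
    print(k, norm_integral_direct(k, 5.0, 1.0, 30.0),
          norm_integral_formula(k, 5.0, 1.0, 30.0))
```

The two columns should agree to within the finite-difference error, which supports the statement that the same combination $R + \partial\eta/\partial k$ controls both the level density and the normalization.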

Including any bound states in the discrete spectrum, the full statement
of completeness is therefore

$$\sum_{\text{bound states}} \psi_n(r)\psi_n(r') + \left(\frac{2}{\pi}\right)\int_0^\infty dk\, \psi_k(r)\psi_k(r') = \delta(r - r').$$  (4.150)

Example: We will exhibit a completeness relation for a problem on the entire
real line. We have already met the Pöschl-Teller equation,

$$H\psi = \left(-\frac{d^2}{dx^2} - l(l+1)\,\mathrm{sech}^2 x\right)\psi = E\psi,$$  (4.151)

in problem 4.7. When l is an integer, the potential in this Schrödinger equation
has the special property that it is reflectionless.

The simplest non-trivial example is l = 1. In this case, H has a single
discrete bound state at E = -1. The normalized eigenfunction is

$$\psi_0(x) = \frac{1}{\sqrt{2}}\,\mathrm{sech}\,x.$$  (4.152)

The rest of the spectrum consists of a continuum of unbound states with
eigenvalues $E(k) = k^2$ and eigenfunctions

$$\psi_k(x) = \frac{1}{\sqrt{1 + k^2}}\, e^{ikx}(-ik + \tanh x).$$  (4.153)

Here, k is any real number. The normalization of $\psi_k(x)$ has been chosen so
that, at large |x|, where $\tanh x \to \pm 1$, we have

$$\psi_k^*(x)\psi_k(x') \to e^{-ik(x - x')}.$$  (4.154)

The measure in the completeness integral must therefore be $dk/2\pi$, the same
as that for a free particle.

Let us compute the difference

$$I = \delta(x - x') - \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \psi_k^*(x)\psi_k(x')$$
$$= \int_{-\infty}^{\infty} \frac{dk}{2\pi} \left(e^{-ik(x - x')} - \psi_k^*(x)\psi_k(x')\right)$$
$$= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{-ik(x - x')}\, \frac{1 + ik(\tanh x - \tanh x') - \tanh x \tanh x'}{1 + k^2}.$$  (4.155)

We use the standard integral,

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{-ik(x - x')}\, \frac{1}{1 + k^2} = \frac{1}{2}\, e^{-|x - x'|},$$  (4.156)

together with its x' derivative,

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{-ik(x - x')}\, \frac{ik}{1 + k^2} = \mathrm{sgn}(x - x')\, \frac{1}{2}\, e^{-|x - x'|},$$  (4.157)

to find

$$I = \frac{1}{2}\left\{1 + \mathrm{sgn}(x - x')(\tanh x - \tanh x') - \tanh x \tanh x'\right\} e^{-|x - x'|}.$$  (4.158)

Assume, without loss of generality, that x > x'; then this reduces to

$$\frac{1}{2}(1 + \tanh x)(1 - \tanh x')\, e^{-(x - x')} = \frac{1}{2}\,\mathrm{sech}\,x\,\mathrm{sech}\,x' = \psi_0(x)\psi_0(x').$$  (4.159)

Thus, the expected completeness condition

$$\psi_0(x)\psi_0(x') + \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \psi_k^*(x)\psi_k(x') = \delta(x - x'),$$  (4.160)

is confirmed.
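Two ingredients of this example lend themselves to a quick numerical spot check: that $e^{ikx}(-ik + \tanh x)/\sqrt{1+k^2}$ really is an eigenfunction of $-d^2/dx^2 - 2\,\mathrm{sech}^2 x$ with eigenvalue $k^2$, and that the closed form found for the difference I equals $\frac{1}{2}\mathrm{sech}\,x\,\mathrm{sech}\,x'$ for either ordering of x and x'. The sketch below is only an illustration; the function names, the finite-difference step and the sample points are our own choices.

```python
import numpy as np


def psi_k(x, k):
    """Continuum eigenfunction of the l = 1 problem, normalized as in the text."""
    return np.exp(1j * k * x) * (-1j * k + np.tanh(x)) / np.sqrt(1.0 + k**2)


def eigen_residual(k, x0, h=1e-3):
    """|(-psi'' - 2 sech^2(x) psi - k^2 psi)(x0)|, psi'' by central differences."""
    d2 = (psi_k(x0 + h, k) - 2.0 * psi_k(x0, k) + psi_k(x0 - h, k)) / h**2
    return abs(-d2 - 2.0 / np.cosh(x0)**2 * psi_k(x0, k) - k**2 * psi_k(x0, k))


def difference_closed_form(x, xp):
    """Closed form of the difference I, valid for either sign of x - xp."""
    bracket = (1.0 + np.sign(x - xp) * (np.tanh(x) - np.tanh(xp))
               - np.tanh(x) * np.tanh(xp))
    return 0.5 * bracket * np.exp(-abs(x - xp))


def bound_state_product(x, xp):
    """psi_0(x) psi_0(xp) with psi_0 = sech(x)/sqrt(2)."""
    return 0.5 / (np.cosh(x) * np.cosh(xp))


print(eigen_residual(1.3, 0.4))
print(difference_closed_form(0.8, -0.5), bound_state_product(0.8, -0.5))
```

The residual should be at the level of the finite-difference error, and the two products should coincide to rounding accuracy.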

4.4 Further Exercises and Problems 

We begin with a practical engineering eigenvalue problem. 

Exercise 4.8: Whirling drive shaft. A thin flexible drive shaft is supported by
two bearings that impose the conditions x' = y' = x = y = 0 at $z = \pm L$.
Here x(z), y(z) denote the transverse displacements of the shaft, and the
primes denote derivatives with respect to z.

Figure 4.6: The n = 1 even-parity mode of a whirling shaft.

The shaft is driven at angular velocity $\omega$. Experience shows that at certain
critical frequencies $\omega_n$ the motion becomes unstable to whirling, a spontaneous
vibration and deformation of the normally straight shaft. If the rotation
frequency is raised above $\omega_n$, the shaft becomes quiescent and straight again
until we reach a frequency $\omega_{n+1}$, at which the pattern is repeated. Our task
is to understand why this happens.
The kinetic energy of the whirling shaft is

$$T = \frac{1}{2}\int_{-L}^{L} \rho\,\{\dot{x}^2 + \dot{y}^2\}\,dz,$$

and the strain energy due to bending is

$$V[x, y] = \frac{1}{2}\int_{-L}^{L} \gamma\,\{(x'')^2 + (y'')^2\}\,dz.$$

a) Write down the Lagrangian, and from it obtain the equations of motion
for the shaft.

b) Seek whirling-mode solutions of the equations of motion in the form

$$x(z, t) = \psi(z)\cos\omega t,$$
$$y(z, t) = \psi(z)\sin\omega t.$$

Show that this quest requires the solution of the eigenvalue problem

$$\frac{\gamma}{\rho}\frac{d^4\psi}{dz^4} = \omega_n^2\,\psi, \qquad \psi'(-L) = \psi(-L) = \psi'(L) = \psi(L) = 0.$$

c) Show that the critical frequencies are given in terms of the solutions $\xi_n$
to the transcendental equation

$$\tanh\xi_n = \pm\tan\xi_n, \qquad (\star)$$

as

$$\omega_n = \sqrt{\frac{\gamma}{\rho}}\left(\frac{\xi_n}{L}\right)^2.$$

Show that the plus sign in $\star$ applies to odd parity modes, where $\psi(z) = -\psi(-z)$,
and the minus sign to even parity modes where $\psi(z) = \psi(-z)$.

Whirling, we conclude, occurs at the frequencies of the natural transverse
vibration modes of the elastic shaft. These modes are excited by slight imbalances
that have negligible effect except when the shaft is being rotated at
the resonant frequency.
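The transcendental equation for the critical frequencies has no closed-form roots, but they are easy to locate numerically. The sketch below is one hedged way of doing so: it multiplies the equation through by the product of the cosine and the hyperbolic cosine to remove the poles of the tangent, and then applies plain bisection. The brackets [2, 3] and [3.5, 4.5] are our own choices, found by inspecting the sign of the function.

```python
import math


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def even_mode(xi):
    """Zero when tanh(xi) = -tan(xi) (minus sign, even-parity modes)."""
    return math.sin(xi) * math.cosh(xi) + math.cos(xi) * math.sinh(xi)


def odd_mode(xi):
    """Zero when tanh(xi) = +tan(xi) (plus sign, odd-parity modes)."""
    return math.sin(xi) * math.cosh(xi) - math.cos(xi) * math.sinh(xi)


xi_even = bisect(even_mode, 2.0, 3.0)     # lowest even-parity root
xi_odd = bisect(odd_mode, 3.5, 4.5)       # lowest non-zero odd-parity root
print(xi_even, xi_odd)
```

The lowest even-parity root lies below the lowest odd-parity one, so the first instability encountered as the rotation rate is raised is the even mode sketched in figure 4.6.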



Insight into adjoint boundary conditions for an ODE can be obtained by
thinking about how we would impose these boundary conditions in a numerical
solution. The next problem explores this.
Problem 4.9: Discrete approximations and self-adjointness. Consider the second
order inhomogeneous equation $Lu \equiv u'' = g(x)$ on the interval $0 < x < 1$.
Here g(x) is known and u(x) is to be found. We wish to solve the problem on a
computer, and so set up a discrete approximation to the ODE in the following
way:

• replace the continuum of independent variables $0 < x < 1$ by the discrete
lattice of points $0 < x_n \equiv (n - \frac{1}{2})/N < 1$. Here N is a positive integer
and n = 1, 2, ..., N;

• replace the functions u(x) and g(x) by the arrays of real variables $u_n \equiv u(x_n)$
and $g_n \equiv g(x_n)$;

• replace the continuum differential operator $d^2/dx^2$ by the difference operator
$\mathcal{D}^2$, defined by $\mathcal{D}^2 u_n \equiv u_{n+1} - 2u_n + u_{n-1}$.

Now do the following problems:

a) Impose continuum Dirichlet boundary conditions u(0) = u(1) = 0. Decide
what these correspond to in the discrete approximation, and write
the resulting set of algebraic equations in matrix form. Show that the
corresponding matrix is real and symmetric.

b) Impose the periodic boundary conditions u(0) = u(1) and u'(0) = u'(1),
and show that these require us to set $u_0 \equiv u_N$ and $u_{N+1} \equiv u_1$. Again
write the system of algebraic equations in matrix form and show that
the resulting matrix is real and symmetric.

c) Consider the non-symmetric N-by-N matrix operator



D 2 u = 
i) What vectors span the null space of $D^2$?

ii) To what continuum boundary conditions for $d^2/dx^2$ does this matrix
correspond?

iii) Consider the matrix $(D^2)^\dagger$. To what continuum boundary conditions
does this matrix correspond? Are they the adjoint boundary
conditions for the differential operator in part ii)?
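For readers who want to experiment before attempting the problem, the periodic case of part b) can be set up in a few lines. The sketch below is only a starting point under our own conventions: it stores $u_1, \ldots, u_N$ in a zero-based array, builds the second-difference matrix with $u_0$ identified with $u_N$ and $u_{N+1}$ with $u_1$, and checks the two properties one expects, namely symmetry and a constant vector in the null space. The Dirichlet case and the non-symmetric matrix of part c) are left to the reader.

```python
import numpy as np


def periodic_second_difference(N):
    """N-by-N matrix of u_{n+1} - 2 u_n + u_{n-1} with periodic wrap-around."""
    if N < 3:
        raise ValueError("need at least three lattice points")
    D2 = -2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
    D2[0, N - 1] = 1.0        # u_0 is identified with u_N
    D2[N - 1, 0] = 1.0        # u_{N+1} is identified with u_1
    return D2


D2 = periodic_second_difference(8)
print(np.allclose(D2, D2.T))                  # symmetric?
print(np.allclose(D2 @ np.ones(8), 0.0))      # constants annihilated?
```

The zero mode found here is the discrete counterpart of the constant solution of u'' = 0 with periodic boundary conditions.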

Exercise 4.10: Let

$$\hat{H} = \begin{pmatrix} -i\partial_x & m_1 - im_2 \\ m_1 + im_2 & i\partial_x \end{pmatrix} = -i\hat{\sigma}_3\partial_x + m_1\hat{\sigma}_1 + m_2\hat{\sigma}_2$$

be a one-dimensional Dirac Hamiltonian. Here $m_1(x)$ and $m_2(x)$ are real
functions and the $\hat{\sigma}_i$ are the Pauli matrices. The matrix differential operator
$\hat{H}$ acts on the two-component "spinor"

$$\Psi(x) = \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \end{pmatrix}.$$

a) Consider the eigenvalue problem $\hat{H}\Psi = E\Psi$ on the interval [a, b]. Show
that the boundary conditions

$$\frac{\psi_1(a)}{\psi_2(a)} = \exp\{i\theta_a\}, \qquad \frac{\psi_1(b)}{\psi_2(b)} = \exp\{i\theta_b\},$$

where $\theta_a$, $\theta_b$ are real angles, make $\hat{H}$ into an operator that is self-adjoint
with respect to the inner product

$$\langle\Psi_1, \Psi_2\rangle = \int_a^b \Psi_1^\dagger(x)\Psi_2(x)\,dx.$$

b) Find the eigenfunctions $\Psi_n$ and eigenvalues $E_n$ in the case that $m_1 = m_2 = 0$
and the $\theta_{a,b}$ are arbitrary real angles.

Here are three further problems involving the completeness of operators with 
a continuous spectrum: 

Problem 4.11: Missing State. In problem 4.7 you will have found that the
Schrödinger equation

$$\left(-\frac{d^2}{dx^2} - 2\,\mathrm{sech}^2 x\right)\psi = E\psi$$

has eigensolutions

$$\psi_k(x) = e^{ikx}(-ik + \tanh x)$$

with eigenvalue $E = k^2$.

• For x large and positive $\psi_k(x) \sim A\,e^{ikx}e^{i\eta(k)}$, while for x large and negative
$\psi_k(x) \sim A\,e^{ikx}e^{-i\eta(k)}$, the (complex) constant A being the same
in both cases. Express the phase shift $\eta(k)$ as the inverse tangent of an
algebraic expression in k.

• Impose periodic boundary conditions $\psi(-L/2) = \psi(+L/2)$ where $L \gg 1$.
Find the allowed values of k and hence an explicit expression for the k-space
density, $\rho(k) = \frac{dn}{dk}$, of the eigenstates.

• Compare your formula for $\rho(k)$ with the corresponding expression, $\rho_0(k) = L/2\pi$,
for the eigenstate density of the zero-potential equation and compute
the integral

$$\int_{-\infty}^{\infty} \{\rho(k) - \rho_0(k)\}\,dk.$$

• Deduce that one eigenfunction has gone missing from the continuum and
become the localized bound state $\psi_0(x) = \frac{1}{\sqrt{2}}\,\mathrm{sech}\,x$.



Problem 4.12: Continuum Completeness. Consider the differential operator

$$\hat{L} = -\frac{d^2}{dx^2}, \qquad 0 \le x < \infty,$$

with self-adjoint boundary conditions $\psi(0)/\psi'(0) = \tan\theta$ for some fixed angle
$\theta$.

• Show that when $\tan\theta < 0$ there is a single normalizable negative-eigenvalue
eigenfunction localized near the origin, but none when $\tan\theta > 0$.

• Show that there is a continuum of positive-eigenvalue eigenfunctions of
the form $\psi_k(x) = \sin(kx + \eta(k))$ where the phase shift $\eta$ is found from

$$e^{i\eta(k)} = \frac{1 + ik\tan\theta}{\sqrt{1 + k^2\tan^2\theta}}.$$

• Write down (no justification required) the appropriate completeness relation

$$\delta(x - x') = \int \left(\frac{dn}{dk}\right) N_k^2\, \psi_k(x)\psi_k(x')\,dk + \sum_{\text{bound}} \psi_n(x)\psi_n(x')$$

with an explicit expression for the product (not the separate factors) of
the density of states and the normalization constant $N_k^2$, and with the
correct limits on the integral over k.

• Confirm that the $\psi_k$ continuum on its own, or together with the bound
state when it exists, forms a complete set. You will do this by evaluating
the integral

$$I(x, x') = \frac{2}{\pi}\int_0^\infty \sin(kx + \eta(k))\sin(kx' + \eta(k))\,dk$$

and interpreting the result. You will need the following standard integral

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ikx}\, \frac{1}{1 + k^2t^2} = \frac{1}{2|t|}\, e^{-|x|/|t|}.$$

Take care! You should monitor how the bound state contribution switches
on and off as $\theta$ is varied. Keeping track of the modulus signs |...| in the
standard integral is essential for this.

Problem 4.13: One-dimensional scattering redux. Consider again the one-dimensional
Schrödinger equation from chapter 3 problem 3.4:

$$-\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi,$$

where V(x) is zero except in a finite interval [-a, a] near the origin.

Figure 4.7: Incoming and outgoing waves in problem 4.13. The asymptotic
regions L and R are defined by L = {x < -a} and R = {x > a}.

For k > 0, consider solutions of the form

$$\psi(x) = \begin{cases} a_L^{\rm in}\, e^{ikx} + a_L^{\rm out}\, e^{-ikx}, & x \in L, \\ a_R^{\rm in}\, e^{-ikx} + a_R^{\rm out}\, e^{ikx}, & x \in R. \end{cases}$$

a) Show that, in the notation of problem 3.4, we have

$$\begin{pmatrix} a_L^{\rm out} \\ a_R^{\rm out} \end{pmatrix} = \begin{pmatrix} r_L(k) & t_R(-k) \\ t_L(k) & r_R(-k) \end{pmatrix} \begin{pmatrix} a_L^{\rm in} \\ a_R^{\rm in} \end{pmatrix},$$

and show that the S-matrix

$$S(k) \equiv \begin{pmatrix} r_L(k) & t_R(-k) \\ t_L(k) & r_R(-k) \end{pmatrix}$$

is unitary.

b) By observing that complex conjugation interchanges the "in" and "out"
waves, show that it is natural to extend the definition of the transmission
and reflection coefficients to all real k by setting $r_{L,R}(k) = r^*_{L,R}(-k)$,
$t_{L,R}(k) = t^*_{L,R}(-k)$.

c) In problem 3.4 we introduced the particular solutions

$$\psi_k(x) = \begin{cases} e^{ikx} + r_L(k)\,e^{-ikx}, & x \in L, \\ t_L(k)\,e^{ikx}, & x \in R, \end{cases} \qquad k > 0,$$

$$\psi_k(x) = \begin{cases} t_R(k)\,e^{ikx}, & x \in L, \\ e^{ikx} + r_R(k)\,e^{-ikx}, & x \in R, \end{cases} \qquad k < 0.$$

Show that, together with any bound states $\psi_n(x)$, these $\psi_k(x)$ satisfy
the completeness relation

$$\sum_{\text{bound}} \psi_n^*(x)\psi_n(x') + \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \psi_k^*(x)\psi_k(x') = \delta(x - x')$$

provided that

$$-\sum_{\text{bound}} \psi_n^*(x)\psi_n(x') = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, r_L(k)\,e^{-ik(x + x')}, \qquad x, x' \in L,$$
$$= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, t_L(k)\,e^{-ik(x - x')}, \qquad x \in L,\ x' \in R,$$
$$= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, t_R(k)\,e^{-ik(x - x')}, \qquad x \in R,\ x' \in L,$$
$$= \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, r_R(k)\,e^{-ik(x + x')}, \qquad x, x' \in R.$$

d) Compute $r_{L,R}(k)$ and $t_{L,R}(k)$ for the potential $V(x) = -\lambda\delta(x)$, and verify
that the conditions in part c) are satisfied.

If you are familiar with complex variable methods, look ahead to chapter
?? where problem ??.?? shows you how to use complex variable methods to
evaluate the Fourier transforms in part c), and so confirm that the bound state
$\psi_n(x)$ and the $\psi_k(x)$ together constitute a complete set of eigenfunctions.

Problem 4.14: Levinson's Theorem and the Friedel sum rule. The interaction
between an attractive impurity and (S-wave, and ignoring spin) electrons in
a metal can be modelled by a one-dimensional Schrödinger equation

$$-\frac{d^2\chi}{dr^2} + V(r)\chi = k^2\chi.$$

Here r is the distance away from the impurity and V(r) is the (spherically
symmetric) impurity potential and $\chi(r) = \sqrt{4\pi}\,r\,\psi(r)$ where $\psi(r)$ is the
three-dimensional wavefunction. The impurity attracts electrons to its vicinity. Let
$\chi_k^0(r) = \sin(kr)$ denote the unperturbed wavefunction, and $\chi_k(r)$ denote the
perturbed wavefunction that beyond the range of impurity potential becomes
$\sin(kr + \eta(k))$. We fix the $2n\pi$ ambiguity in the definition of $\eta(k)$ by taking
$\eta(\infty)$ to be zero, and requiring $\eta(k)$ to be a continuous function of k.

• Show that the continuous-spectrum contribution to the change in the
number of electrons within a sphere of radius R surrounding the impurity
is given by

$$\frac{2}{\pi}\int_0^{k_f}\left(\int_0^R \left\{|\chi_k(r)|^2 - |\chi_k^0(r)|^2\right\} dr\right) dk = \frac{1}{\pi}\left[\eta(k_f) - \eta(0)\right] + \text{oscillations}.$$

Here $k_f$ is the Fermi momentum, and "oscillations" refers to Friedel oscillations
$\approx \cos(2(k_f R + \eta))$. You should write down an explicit expression
for the Friedel oscillation term, and recognize it as the Fourier transform
of a function $\propto k^{-1}\sin\eta(k)$.

• Appeal to the Riemann-Lebesgue lemma to argue that the Friedel density
oscillations make no contribution to the accumulated electron number in
the limit $R \to \infty$.

(Hint: You may want to look ahead to the next part of the problem in
order to show that $k^{-1}\sin\eta(k)$ remains finite as $k \to 0$.)

The impurity-induced change in the number of unbound electrons in the interval
[0, R] is generically some fraction of an electron, and, in the case of
an attractive potential, can be negative, the phase-shift being positive and
decreasing steadily to zero as k increases to infinity. This should not be surprising.
Each electron in the Fermi sea speeds up as it enters an attractive
potential well, spends less time there, and so makes a smaller contribution
to the average local density than it would in the absence of the potential.
We would, however, surely expect an attractive potential to accumulate a net
positive number of electrons.

• Show that a negative continuous-spectrum contribution to the accumulated
electron number is more than compensated for by a positive number

$$N_{\rm bound} = \int_0^\infty (\rho_0(k) - \rho(k))\,dk = -\int_0^\infty \frac{1}{\pi}\frac{\partial\eta}{\partial k}\,dk = \frac{1}{\pi}\eta(0)$$

of electrons bound to the potential. After accounting for these bound
electrons, show that the total number of electrons accumulated near the
impurity is

$$Q_{\rm tot} = \frac{1}{\pi}\eta(k_f).$$

This formula (together with its higher angular momentum versions) is known
as the Friedel sum rule. The relation between $\eta(0)$ and the number of
bound states is called Levinson's theorem. A more rigorous derivation
of this theorem would show that $\eta(0)$ may take the value $(n + 1/2)\pi$
when there is a non-normalizable zero-energy "half-bound" state. In
this exceptional case the accumulated charge will depend on R.
Chapter 5
Green Functions 



In this chapter we will study strategies for solving the inhomogeneous linear 
differential equation Ly = f. The tool we use is the Green function, which 
is an integral kernel representing the inverse operator $L^{-1}$. Apart from their
use in solving inhomogeneous equations, Green functions play an important 
role in many areas of physics. 

5.1 Inhomogeneous Linear equations 

We wish to solve Ly = f for y. Before we set about doing this, we should 
ask ourselves whether a solution exists, and, if it does, whether it is unique. 
The answers to these questions are summarized by the Fredholm alternative. 

5.1.1 Fredholm alternative 

The Fredholm alternative for operators on a finite-dimensional vector space 
is discussed in detail in the appendix on linear algebra. You will want to 
make sure that you have read and understood this material. Here, we merely 
restate the results. 

Let $V$ be a finite-dimensional vector space equipped with an inner product,
and let $A$ be a linear operator $A : V \to V$ on this space. Then

I. Either

i) $Ax = b$ has a unique solution,

or

ii) $Ax = 0$ has a non-trivial solution.

II. If $Ax = 0$ has $n$ linearly independent solutions, then so does $A^\dagger x = 0$.

III. If alternative ii) holds, then $Ax = b$ has no solution unless $b$ is
perpendicular to all solutions of $A^\dagger x = 0$.

What is important for us in the present chapter is that this result continues
to hold for linear differential operators $L$ on a finite interval, provided
that we define $L^\dagger$ as in the previous chapter, and provided the number
of boundary conditions is equal to the order of the equation.

If the number of boundary conditions is not equal to the order of the
equation then the number of solutions to $Ly = 0$ and $L^\dagger y = 0$ will
differ in general. It is still true, however, that $Ly = f$ has no solution
unless $f$ is perpendicular to all solutions of $L^\dagger y = 0$.
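
The finite-dimensional statements can be checked by hand on a small singular matrix. The following is a minimal sketch in plain Python; the matrix and the vectors `z`, `w`, `b_good`, `b_bad` are illustrative choices, not taken from the text.

```python
# Fredholm alternative for a singular 2x2 matrix (illustrative numbers).
A = [[1.0, 2.0],
     [3.0, 6.0]]                     # det A = 0, so alternative ii) applies


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, x):
    return [dot(row, x) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


z = [2.0, -1.0]                      # non-trivial solution of A z = 0
w = [3.0, -1.0]                      # non-trivial solution of A^T w = 0


def solvable(b, tol=1e-12):
    """Statement III: A x = b needs b perpendicular to the zero mode of A^T."""
    return abs(dot(w, b)) < tol


b_good = [1.0, 3.0]                  # perpendicular to w; x = (1, 0) works
b_bad = [1.0, 0.0]                   # <w, b_bad> = 3, so no solution exists

print(matvec(A, z), matvec(transpose(A), w))
print(solvable(b_good), solvable(b_bad))
```

Because $A$ is real, its adjoint is just the transpose. Both $A$ and $A^T$ have exactly one zero mode here, in line with statement II.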

Example: As an illustration of what happens when an equation has too
many boundary conditions, consider

$$ Ly = \frac{dy}{dx}, \qquad y(0) = y(1) = 0. \tag{5.1} $$

Clearly $Ly = 0$ has only the trivial solution $y \equiv 0$. If a solution to
$Ly = f$ exists, therefore, it will be unique.

We know that $L^\dagger = -d/dx$, with no boundary conditions on the functions
in its domain. The equation $L^\dagger y = 0$ therefore has the non-trivial
solution $y = 1$. This means that there should be no solution to $Ly = f$ unless

$$ \langle 1, f\rangle = \int_0^1 f\,dx = 0. \tag{5.2} $$

If this condition is satisfied then

$$ y(x) = \int_0^x f(x')\,dx' \tag{5.3} $$

satisfies both the differential equation and the boundary conditions at
$x = 0, 1$. If the condition is not satisfied, $y(x)$ is not a solution,
because $y(1) \ne 0$.

Initially we only solve Ly = f for homogeneous boundary conditions. 
After we have understood how to do this, we will extend our methods to deal 
with differential equations with inhomogeneous boundary conditions. 

5.2 Constructing Green Functions 

We will solve $Ly = f$, a differential equation with homogeneous boundary
conditions, by finding an inverse operator $L^{-1}$, so that $y = L^{-1}f$. This
inverse operator $L^{-1}$ will be represented by an integral kernel

$$ (L^{-1})_{x,\xi} = G(x,\xi), \tag{5.4} $$

with the property

$$ L_x G(x,\xi) = \delta(x-\xi). \tag{5.5} $$

Here, the subscript $x$ on $L$ indicates that $L$ acts on the first argument,
$x$, of $G$. Then

$$ y(x) = \int G(x,\xi) f(\xi)\,d\xi \tag{5.6} $$

will obey

$$ L_x y = \int L_x G(x,\xi) f(\xi)\,d\xi = \int \delta(x-\xi) f(\xi)\,d\xi = f(x). \tag{5.7} $$

The problem is how to construct $G(x,\xi)$. There are three necessary
ingredients:

• the function $\chi(x) \equiv G(x,\xi)$ must have some discontinuous behaviour
at $x = \xi$ in order to generate the delta function;

• away from $x = \xi$, the function $\chi(x)$ must obey $L\chi = 0$;

• the function $\chi(x)$ must obey the homogeneous boundary conditions
required of $y$ at the ends of the interval.

The last ingredient ensures that the resulting solution, $y(x)$, obeys the
boundary conditions. It also ensures that the range of the integral operator
$G$ lies within the domain of $L$, a prerequisite if the product $LG = I$ is to
make sense. The manner in which these ingredients are assembled to construct
$G(x,\xi)$ is best explained through examples.

5.2.1 Sturm-Liouville equation 

We begin by constructing the solution to the equation

$$ (p(x)y')' + q(x)y(x) = f(x) \tag{5.8} $$

on the finite interval $[a, b]$ with homogeneous self-adjoint boundary conditions



(5.9) 
We therefore seek a function $G(x,\xi)$ such that $\chi(x) = G(x,\xi)$ obeys

$$ L\chi = (p\chi')' + q\chi = \delta(x-\xi). \tag{5.10} $$

The function $\chi(x)$ must also obey the homogeneous boundary conditions we
require of $y(x)$.

Now (5.10) tells us that $\chi(x)$ must be continuous at $x = \xi$. For if not,
the two differentiations applied to a jump function would give us the
derivative of a delta function, and we want only a plain $\delta(x-\xi)$. If we
write

$$ G(x,\xi) = \chi(x) = \begin{cases} A\,y_L(x)\,y_R(\xi), & x < \xi, \\ A\,y_L(\xi)\,y_R(x), & x > \xi, \end{cases} \tag{5.11} $$

then $\chi(x)$ is automatically continuous at $x = \xi$. We take $y_L(x)$ to be a
solution of $Ly = 0$, chosen to satisfy the boundary condition at the left hand
end of the interval. Similarly $y_R(x)$ should solve $Ly = 0$ and satisfy the
boundary condition at the right hand end. With these choices we satisfy
(5.10) at all points away from $x = \xi$.

To work out how to satisfy the equation exactly at the location of the
delta-function, we integrate (5.10) from $\xi - \varepsilon$ to $\xi + \varepsilon$
and find that

$$ p(\xi)\left[\chi'(\xi+\varepsilon) - \chi'(\xi-\varepsilon)\right] = 1. \tag{5.12} $$

With our product form for $\chi(x)$, this jump condition becomes

$$ A\,p(\xi)\left(y_L(\xi)\,y_R'(\xi) - y_L'(\xi)\,y_R(\xi)\right) = 1 \tag{5.13} $$

and determines the constant $A$. We recognize the Wronskian $W(y_L, y_R; \xi)$
on the left hand side of this equation. We therefore have $A = 1/(pW)$ and

$$ G(x,\xi) = \begin{cases} \dfrac{1}{pW}\,y_L(x)\,y_R(\xi), & x < \xi, \\[1ex] \dfrac{1}{pW}\,y_L(\xi)\,y_R(x), & x > \xi. \end{cases} \tag{5.14} $$

For the Sturm-Liouville equation the product $pW$ is constant. This fact
follows from Liouville's formula,

$$ W(x) = W(0)\exp\left\{-\int_0^x \frac{p_1(\xi)}{p_0(\xi)}\,d\xi\right\}, \tag{5.15} $$

and from $p_1 = p_0' = p'$ in the Sturm-Liouville equation. Thus

$$ W(x) = W(0)\exp\left(-\ln[p(x)/p(0)]\right) = W(0)\frac{p(0)}{p(x)}. \tag{5.16} $$

The constancy of $pW$ means that $G(x,\xi)$ is symmetric:

$$ G(x,\xi) = G(\xi,x). \tag{5.17} $$

This is as it should be. The inverse of a symmetric matrix (and the real,
self-adjoint, Sturm-Liouville operator is the function-space analogue of a real
symmetric matrix) is itself symmetric.

The solution to

$$ Ly = (p_0 y')' + qy = f(x) \tag{5.18} $$

is therefore

$$ y(x) = \frac{1}{pW}\left\{ y_L(x)\int_x^b y_R(\xi) f(\xi)\,d\xi + y_R(x)\int_a^x y_L(\xi) f(\xi)\,d\xi \right\}. \tag{5.19} $$

Take care to understand the ranges of integration in this formula. In the
first integral $\xi > x$ and we use $G(x,\xi) \propto y_L(x)y_R(\xi)$. In the
second integral $\xi < x$ and we use $G(x,\xi) \propto y_L(\xi)y_R(x)$. It is
easy to get these the wrong way round.

Because we must divide by it in constructing $G(x,\xi)$, it is necessary that
the Wronskian $W(y_L, y_R)$ not be zero. This is reasonable. If $W$ were zero
then $y_L \propto y_R$, and the single function $y_R$ satisfies both $Ly_R = 0$
and the boundary conditions. This means that the differential operator $L$ has
$y_R$ as a zero-mode, so there can be no unique solution to $Ly = f$.
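Example: Solve

$$ -\partial_x^2 y = f(x), \qquad y(0) = y(1) = 0. \tag{5.20} $$

We have

$$ y_L = x, \quad y_R = 1 - x \quad\Rightarrow\quad y_L' y_R - y_L y_R' \equiv 1. \tag{5.21} $$

We find that

$$ G(x,\xi) = \begin{cases} x(1-\xi), & x < \xi, \\ \xi(1-x), & x > \xi, \end{cases} \tag{5.22} $$

and

$$ y(x) = (1-x)\int_0^x \xi f(\xi)\,d\xi + x\int_x^1 (1-\xi) f(\xi)\,d\xi. \tag{5.23} $$

Figure 5.1: The function $\chi(x) = G(x,\xi)$.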
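
As a numerical sanity check on (5.22), the following sketch integrates the Green function against a source and compares with a known solution. The test source $f = 1$, the midpoint-rule quadrature and the function names are assumptions made for illustration.

```python
def G(x, xi):
    """Green function (5.22) for -y'' = f with y(0) = y(1) = 0."""
    return x * (1.0 - xi) if x < xi else xi * (1.0 - x)


def solve(f, x, n=2000):
    """Approximate y(x) = int_0^1 G(x, xi) f(xi) d(xi) by the midpoint rule."""
    h = 1.0 / n
    return sum(G(x, (j + 0.5) * h) * f((j + 0.5) * h) for j in range(n)) * h


f = lambda xi: 1.0                       # hypothetical test source
exact = lambda x: 0.5 * x * (1.0 - x)    # solves -y'' = 1 with y(0) = y(1) = 0

for x in (0.1, 0.3, 0.8):
    print(x, solve(f, x), exact(x))
```

The kink of $G$ at $\xi = x$ is what produces the delta function, so the quadrature needs no special treatment beyond a reasonably fine grid.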




5.2.2 Initial-value problems

Initial-value problems are those boundary-value problems where all boundary
conditions are imposed at one end of the interval, instead of some conditions
at one end and some at the other. The same ingredients go into constructing
the Green function, though.
Consider the problem

$$ \frac{dy}{dt} - Q(t)y = F(t), \qquad y(0) = 0. \tag{5.24} $$

We seek a Green function such that

$$ L_t G(t,t') \equiv \left(\frac{d}{dt} - Q(t)\right) G(t,t') = \delta(t-t') \tag{5.25} $$

and $G(0,t') = 0$.

We need $\chi(t) = G(t,t')$ to satisfy $L_t\chi = 0$, except at $t = t'$, and
need $\chi(0) = 0$. The unique solution of $L_t\chi = 0$ with $\chi(0) = 0$ is
$\chi(t) \equiv 0$. This means that $G(t,t') = 0$ for all $t < t'$. Near
$t = t'$ we have the jump condition

$$ G(t'+\varepsilon, t') - G(t'-\varepsilon, t') = 1. \tag{5.26} $$

The unique solution is

$$ G(t,t') = \theta(t-t')\exp\left\{\int_{t'}^{t} Q(s)\,ds\right\}, \tag{5.27} $$

where $\theta(t-t')$ is the Heaviside step distribution

$$ \theta(t) = \begin{cases} 0, & t < 0, \\ 1, & t > 0. \end{cases} \tag{5.28} $$

Figure 5.2: The Green function $G(t,t')$ for the first-order initial value problem.



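Therefore

$$ \begin{aligned} y(t) &= \int_0^\infty G(t,t') F(t')\,dt' \\ &= \int_0^t \exp\left\{\int_{t'}^{t} Q(s)\,ds\right\} F(t')\,dt' \\ &= \exp\left\{\int_0^t Q(s)\,ds\right\} \int_0^t \exp\left\{-\int_0^{t'} Q(s)\,ds\right\} F(t')\,dt'. \end{aligned} \tag{5.29} $$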



We earlier obtained this solution via variation of parameters. 
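
The causal Green function of the first-order problem can be tested numerically in a few lines. This is a sketch under stated assumptions: the constant coefficient $Q = -1$, the constant source $F = 1$ and the midpoint-rule helper `integrate` are illustrative choices, for which the exact solution is $y = 1 - e^{-t}$.

```python
import math


def integrate(g, a, b, n=400):
    """Midpoint-rule approximation to the integral of g from a to b."""
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h


def green(t, tp, Q):
    """G(t, t') = theta(t - t') exp(int_{t'}^{t} Q(s) ds), as in (5.27)."""
    if t < tp:
        return 0.0                   # causal: no response before the kick
    return math.exp(integrate(Q, tp, t))


def solve_ivp(Q, F, t):
    """y(t) = int_0^t G(t, t') F(t') dt', the second line of (5.29)."""
    return integrate(lambda tp: green(t, tp, Q) * F(tp), 0.0, t)


Q = lambda s: -1.0                   # hypothetical choice: dy/dt + y = F
F = lambda s: 1.0

print(solve_ivp(Q, F, 2.0), 1.0 - math.exp(-2.0))
```

The nested quadrature is deliberately naive; for a general $Q(t)$ one would tabulate the inner integral once rather than recompute it for every $t'$.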

Example: Forced, Damped, Harmonic Oscillator. An oscillator obeys the
equation

$$ \ddot x + 2\gamma\dot x + (\Omega^2 + \gamma^2)x = F(t). \tag{5.30} $$

Here $\gamma > 0$ is the friction coefficient. Assuming that the oscillator is
at rest at the origin at $t = 0$, we will show that

$$ x(t) = \frac{1}{\Omega}\int_0^t e^{-\gamma(t-\tau)}\sin\Omega(t-\tau)\,F(\tau)\,d\tau. \tag{5.31} $$

We seek a Green function $G(t,\tau)$ such that $\chi(t) = G(t,\tau)$ obeys
$\chi(0) = \chi'(0) = 0$. Again, the unique solution of the differential
equation with this initial data is $\chi(t) \equiv 0$. The Green function must
be continuous at $t = \tau$, but its derivative must be discontinuous there,
jumping from zero to unity to provide the delta function. Thereafter, it must
satisfy the homogeneous equation. The unique function satisfying all these
requirements is

$$ G(t,\tau) = \theta(t-\tau)\frac{1}{\Omega}e^{-\gamma(t-\tau)}\sin\Omega(t-\tau). \tag{5.32} $$






Figure 5.3: The Green function $G(t,\tau)$ for the damped oscillator problem.

Both these initial-value Green functions $G(t,t')$ are identically zero when
$t < t'$. This is because the Green function is the response of the system to a
kick at time $t = t'$, and in physical problems no effect comes before its cause.
Such Green functions are said to be causal.
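
The oscillator Green function (5.32) can be checked against a case with a closed-form answer. In this sketch the constant force $F = 1$, the parameter values and the function names are assumptions; the closed form used for comparison follows from integrating (5.31) directly.

```python
import math


def G(t, tau, gamma, Omega):
    """Causal Green function (5.32) of the damped oscillator."""
    if t < tau:
        return 0.0
    s = t - tau
    return math.exp(-gamma * s) * math.sin(Omega * s) / Omega


def response(F, t, gamma, Omega, n=4000):
    """x(t) = int_0^t G(t, tau) F(tau) d(tau), midpoint rule."""
    h = t / n
    return sum(G(t, (j + 0.5) * h, gamma, Omega) * F((j + 0.5) * h)
               for j in range(n)) * h


def step_response_exact(t, gamma, Omega):
    """Closed form of (5.31) for the constant force F = 1."""
    decay = math.exp(-gamma * t)
    bracket = math.cos(Omega * t) + gamma / Omega * math.sin(Omega * t)
    return (1.0 - decay * bracket) / (Omega**2 + gamma**2)


gamma, Omega = 0.5, 2.0              # hypothetical parameter values
print(response(lambda tau: 1.0, 3.0, gamma, Omega),
      step_response_exact(3.0, gamma, Omega))
```

At late times the response settles to $1/(\Omega^2 + \gamma^2)$, the static displacement expected from (5.30).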

Physics application: friction without friction — the Caldeira-Leggett 
model in real time. 

We now describe an application of the initial-value problem Green function 
we found in the preceding example. 

When studying the quantum mechanics of systems with friction, such as 
the viscously damped oscillator, we need a tractable model of the dissipative 
process. Such a model was introduced by Caldeira and Leggett.¹ They
consider the Lagrangian 



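$$ L = \frac{1}{2}\left(\dot Q^2 - (\Omega^2 - \Delta\Omega^2)Q^2\right) - Q\sum_i f_i q_i + \sum_i \frac{1}{2}\left(\dot q_i^2 - \omega_i^2 q_i^2\right), \tag{5.33} $$

which describes a macroscopic variable $Q(t)$, linearly coupled to an oscillator
bath of very many simple systems $q_i$ representing the environment. The
quantity

$$ \Delta\Omega^2 \equiv -\sum_i \left(\frac{f_i^2}{\omega_i^2}\right) \tag{5.34} $$

is a counter-term that is inserted to cancel the frequency shift

$$ \Omega^2 \to \Omega^2 - \sum_i \left(\frac{f_i^2}{\omega_i^2}\right) \tag{5.35} $$

caused by the coupling to the bath.²

¹ A. Caldeira, A. J. Leggett, Phys. Rev. Lett. 46 (1981) 211.

The equations of motion are

$$ \begin{aligned} \ddot Q + (\Omega^2 - \Delta\Omega^2)Q + \sum_i f_i q_i &= 0, \\ \ddot q_i + \omega_i^2 q_i + f_i Q &= 0. \end{aligned} \tag{5.36} $$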
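Using our initial-value Green function, we solve for the $q_i$ in terms of
$Q(t)$:

$$ f_i q_i = -\int_{-\infty}^{t} \left(\frac{f_i^2}{\omega_i}\right)\sin\omega_i(t-\tau)\,Q(\tau)\,d\tau. \tag{5.37} $$

The resulting motion of the $q_i$ feeds back into the equation for $Q$ to give

$$ \ddot Q + (\Omega^2 - \Delta\Omega^2)Q + \int_{-\infty}^{t} F(t-\tau)Q(\tau)\,d\tau = 0, \tag{5.38} $$

where

$$ F(t) = -\sum_i \left(\frac{f_i^2}{\omega_i}\right)\sin(\omega_i t) \tag{5.39} $$

is a memory function.

It is now convenient to introduce a spectral function

$$ J(\omega) = \frac{\pi}{2}\sum_i \left(\frac{f_i^2}{\omega_i}\right)\delta(\omega - \omega_i), \tag{5.40} $$

which characterizes the spectrum of couplings and frequencies associated
with the oscillator bath. In terms of $J(\omega)$ we can write

$$ F(t) = -\frac{2}{\pi}\int_0^\infty J(\omega)\sin(\omega t)\,d\omega. \tag{5.41} $$

² The shift arises because a static $Q$ displaces the bath oscillators so that
$f_i q_i = -(f_i^2/\omega_i^2)Q$. Substituting these values for the $f_i q_i$ into
the potential terms shows that, in the absence of $\Delta\Omega^2 Q^2$, the
effective potential seen by $Q$ would be

$$ \frac{1}{2}\Omega^2 Q^2 + Q\sum_i f_i q_i + \sum_i \frac{1}{2}\omega_i^2 q_i^2 = \frac{1}{2}\left(\Omega^2 - \sum_i \left(\frac{f_i^2}{\omega_i^2}\right)\right)Q^2. $$
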
Although $J(\omega)$ is defined as a sum of delta function "spikes," the
oscillator bath contains a very large number of systems and this makes
$J(\omega)$ effectively a smooth function. This is just as the density of a gas
(a sum of delta functions at the location of the atoms) is macroscopically
smooth. By taking different forms for $J(\omega)$ we can represent a wide range
of environments. Caldeira and Leggett show that to obtain a friction force
proportional to $\dot Q$ we should make $J(\omega)$ proportional to the
frequency $\omega$. To see how this works, consider the choice

$$ J(\omega) = \eta\omega\left[\frac{\Lambda^2}{\Lambda^2 + \omega^2}\right], \tag{5.42} $$

which is equal to $\eta\omega$ for small $\omega$, but tends to zero when
$\omega \gg \Lambda$. The high-frequency cutoff $\Lambda$ is introduced to make
the integrals over $\omega$ converge. With this cutoff

$$ \frac{2}{\pi}\int_0^\infty J(\omega)\sin(\omega t)\,d\omega = \frac{2}{\pi}\int_0^\infty \frac{\eta\omega\Lambda^2\sin(\omega t)}{\Lambda^2 + \omega^2}\,d\omega = {\rm sgn}(t)\,\eta\Lambda^2 e^{-\Lambda|t|}. \tag{5.43} $$

Therefore,

$$ \begin{aligned} \int_{-\infty}^{t} F(t-\tau)Q(\tau)\,d\tau &= -\int_{-\infty}^{t} \eta\Lambda^2 e^{-\Lambda|t-\tau|} Q(\tau)\,d\tau \\ &= -\eta\Lambda Q(t) + \eta\dot Q(t) - \frac{\eta}{\Lambda}\ddot Q(t) + \cdots, \end{aligned} \tag{5.44} $$

where the second line results from expanding $Q(\tau)$ as a Taylor series

$$ Q(\tau) = Q(t) + (\tau - t)\dot Q(t) + \cdots, \tag{5.45} $$

and integrating term-by-term. Now,

$$ -\Delta\Omega^2 \equiv \sum_i \left(\frac{f_i^2}{\omega_i^2}\right) = \frac{2}{\pi}\int_0^\infty \frac{J(\omega)}{\omega}\,d\omega = \frac{2}{\pi}\int_0^\infty \frac{\eta\Lambda^2}{\Lambda^2 + \omega^2}\,d\omega = \eta\Lambda. \tag{5.46} $$

The $-\Delta\Omega^2 Q$ counter-term thus cancels the leading term
$-\eta\Lambda Q(t)$ in (5.44), which would otherwise represent a
$\Lambda$-dependent frequency shift. After this cancellation we can safely let
$\Lambda \to \infty$, and so ignore terms with negative powers of the cutoff.
The only surviving term in (5.44) is then $\eta\dot Q$. This we substitute into
(5.38), which becomes the equation for viscously damped motion:

$$ \ddot Q + \eta\dot Q + \Omega^2 Q = 0. \tag{5.47} $$

The oscillators in the bath absorb energy but, unlike a pair of coupled
oscillators which trade energy rhythmically back-and-forth, the incommensurate
motion of the many $q_i$ prevents them from cooperating for long enough to
return any energy to $Q(t)$.
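
The way the memory integral in (5.44) reduces to a friction term can be seen numerically. The sketch below assumes a test trajectory $Q(t) = \cos t$ and hypothetical values of $\eta$ and $\Lambda$; it evaluates the first line of (5.44) by quadrature, adds back the counter-term contribution $+\eta\Lambda Q(t)$, and compares with $\eta\dot Q(t)$.

```python
import math


def memory_term(Q, t, eta, lam, n=4000):
    """-eta*lam^2 * int_0^inf exp(-lam*s) Q(t - s) ds, i.e. the first line of
    (5.44) with s = t - tau; the integral is truncated at s = 40/lam."""
    h = 40.0 / lam / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        total += math.exp(-lam * s) * Q(t - s)
    return -eta * lam**2 * total * h


def after_counterterm(Q, t, eta, lam):
    """Memory term plus the counter-term contribution +eta*lam*Q(t)."""
    return memory_term(Q, t, eta, lam) + eta * lam * Q(t)


eta, t = 0.3, 1.0                    # hypothetical parameter values
for lam in (10.0, 100.0, 1000.0):
    # The last column is eta * dQ/dt for Q = cos.
    print(lam, after_counterterm(math.cos, t, eta, lam), -eta * math.sin(t))
```

The two columns approach one another as the cutoff grows, with a difference that shrinks roughly like $1/\Lambda$, as the neglected $(\eta/\Lambda)\ddot Q$ term suggests.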



5.2.3 Modified Green function 

When the equation $Ly = 0$ has a non-trivial solution, there can be no unique
solution to $Ly = f$, but there still will be solutions provided $f$ is
orthogonal to all solutions of $L^\dagger y = 0$.

Example: Consider

$$ Ly = -\partial_x^2 y = f(x), \qquad y'(0) = y'(1) = 0. \tag{5.48} $$

The equation $Ly = 0$ has one non-trivial solution, $y(x) = 1$. The operator
$L$ is self-adjoint, $L^\dagger = L$, and so there will be solutions to $Ly = f$
provided

$$ \langle 1, f\rangle = \int_0^1 f\,dx = 0. $$

We cannot define the Green function as a solution to

$$ -\partial_x^2 G(x,\xi) = \delta(x-\xi), \tag{5.49} $$

because $\int_0^1 \delta(x-\xi)\,dx = 1 \ne 0$, but we can seek a solution to

$$ -\partial_x^2 G(x,\xi) = \delta(x-\xi) - 1 \tag{5.50} $$

as the right-hand side integrates to zero.

A general solution to $-\partial_x^2 y = -1$ is

$$ y = A + Bx + \frac{1}{2}x^2, \tag{5.51} $$

and the functions

$$ \begin{aligned} y_L &= A + \frac{1}{2}x^2, \\ y_R &= C - x + \frac{1}{2}x^2, \end{aligned} \tag{5.52} $$

obey the boundary conditions at the left and right ends of the interval,
respectively. Continuity at $x = \xi$ demands that $A = C - \xi$, and we are
left with

$$ G(x,\xi) = \begin{cases} C - \xi + \frac{1}{2}x^2, & 0 < x < \xi, \\ C - x + \frac{1}{2}x^2, & \xi < x < 1. \end{cases} \tag{5.53} $$

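There is no freedom left to impose the condition

$$ G'(\xi-\varepsilon,\xi) - G'(\xi+\varepsilon,\xi) = 1, \tag{5.54} $$

but it is automatically satisfied. Indeed,

$$ \begin{aligned} G'(\xi-\varepsilon,\xi) &= \xi, \\ G'(\xi+\varepsilon,\xi) &= -1 + \xi. \end{aligned} \tag{5.55} $$

We may select a different value of $C$ for each $\xi$, and a convenient choice
is

$$ C = \frac{1}{2}\xi^2 + \frac{1}{3}, \tag{5.56} $$

which makes $G$ symmetric:

$$ G(x,\xi) = \begin{cases} \frac{1}{3} - \xi + \frac{1}{2}(x^2 + \xi^2), & 0 < x < \xi, \\ \frac{1}{3} - x + \frac{1}{2}(x^2 + \xi^2), & \xi < x < 1. \end{cases} \tag{5.57} $$

It also makes $\int_0^1 G(x,\xi)\,dx = 0$.

Figure 5.4: The modified Green function.

The solution to $Ly = f$ is

$$ y(x) = \int_0^1 G(x,\xi) f(\xi)\,d\xi + A, \tag{5.58} $$

where $A$ is arbitrary.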
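
The properties claimed for the modified Green function (5.57) are easy to test numerically. In the sketch below the source $f(x) = \cos\pi x$ (which integrates to zero on $[0,1]$), the midpoint-rule helper and the function names are illustrative assumptions; for this source the solution with zero mean is $\cos(\pi x)/\pi^2$.

```python
import math


def G_mod(x, xi):
    """Modified Green function (5.57) for -y'' = f with y'(0) = y'(1) = 0."""
    return 1.0 / 3.0 - max(x, xi) + 0.5 * (x * x + xi * xi)


def integrate(g, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return sum(g((j + 0.5) * h) for j in range(n)) * h


def solve(f, x):
    """y(x) = int_0^1 G_mod(x, xi) f(xi) d(xi), i.e. (5.58) with A = 0."""
    return integrate(lambda xi: G_mod(x, xi) * f(xi))


f = lambda xi: math.cos(math.pi * xi)            # integrates to zero on [0, 1]
exact = lambda x: math.cos(math.pi * x) / math.pi**2

print(integrate(lambda x: G_mod(x, 0.4)))        # orthogonality to the zero mode
print(solve(f, 0.25), exact(0.25))
```

Writing the two branches of (5.57) with `max(x, xi)` makes the symmetry $G(x,\xi) = G(\xi,x)$ manifest.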
5.3 Applications of Lagrange's Identity

5.3.1 Hermiticity of Green function

Earlier we noted the symmetry of the Green function for the Sturm-Liouville 
equation. We will now establish the corresponding result for general differ- 
ential operators. 

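Let $G(x,\xi)$ obey $L_x G(x,\xi) = \delta(x-\xi)$ with homogeneous boundary
conditions $B$, and let $G^\dagger(x,\xi)$ obey
$L_x^\dagger G^\dagger(x,\xi) = \delta(x-\xi)$ with adjoint boundary conditions
$B^\dagger$. Then, from Lagrange's identity, we have

$$ \begin{aligned} \left[Q(G, G^\dagger)\right]_a^b &= \int_a^b dx\,\left\{ \left(L_x^\dagger G^\dagger(x,\xi)\right)^* G(x,\xi') - \left(G^\dagger(x,\xi)\right)^* L_x G(x,\xi') \right\} \\ &= \int_a^b dx\,\left\{ \delta(x-\xi)\,G(x,\xi') - \left(G^\dagger(x,\xi)\right)^* \delta(x-\xi') \right\} \\ &= G(\xi,\xi') - \left(G^\dagger(\xi',\xi)\right)^*. \end{aligned} \tag{5.59} $$

Thus, provided $[Q(G, G^\dagger)]_a^b = 0$, which is indeed the case because the
boundary conditions for $L$, $L^\dagger$ are mutually adjoint, we have

$$ G^\dagger(\xi,x) = \left(G(x,\xi)\right)^*, \tag{5.60} $$

and the Green functions, regarded as matrices with continuous rows and
columns, are Hermitian conjugates of one another.

Example: Let

$$ L = \frac{d}{dx}, \qquad \mathcal{D}(L) = \{y, Ly \in L^2[0,1] : y(0) = 0\}. \tag{5.61} $$

In this case $G(x,\xi) = \theta(x-\xi)$.

Now, we have

$$ L^\dagger = -\frac{d}{dx}, \qquad \mathcal{D}(L^\dagger) = \{y, Ly \in L^2[0,1] : y(1) = 0\} \tag{5.62} $$

and $G^\dagger(x,\xi) = \theta(\xi-x)$.

Figure 5.5: $G(x,\xi) = \theta(x-\xi)$, and $G^\dagger(x,\xi) = \theta(\xi-x)$.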



5.3.2 Inhomogeneous boundary conditions 

Our differential operators have been defined with linear homogeneous bound- 
ary conditions. We can, however, use them, and their Green-function in- 
verses, to solve differential equations with inhomogeneous boundary condi- 
tions. 

Suppose, for example, we wish to solve 



$$ -\partial_x^2 y = f(x), \qquad y(0) = a, \quad y(1) = b. \tag{5.63} $$

We already know the Green function for the homogeneous boundary-condition
problem with operator

$$ L = -\partial_x^2, \qquad \mathcal{D}(L) = \{y, Ly \in L^2[0,1] : y(0) = 0,\ y(1) = 0\}. \tag{5.64} $$

It is

$$ G(x,\xi) = \begin{cases} x(1-\xi), & x < \xi, \\ \xi(1-x), & x > \xi. \end{cases} \tag{5.65} $$

Now we apply Lagrange's identity to $\chi(x) = G(x,\xi)$ and $y(x)$ to get

$$ \int_0^1 dx\,\left\{ G(x,\xi)\left(-\partial_x^2 y(x)\right) - y(x)\left(-\partial_x^2 G(x,\xi)\right) \right\} = \left[ G'(x,\xi)y(x) - G(x,\xi)y'(x) \right]_0^1. \tag{5.66} $$

Here, as usual, $G'(x,\xi) = \partial_x G(x,\xi)$. The integral is equal to

$$ \int_0^1 dx\,\left\{ G(x,\xi)f(x) - y(x)\delta(x-\xi) \right\} = \int_0^1 G(x,\xi)f(x)\,dx - y(\xi), \tag{5.67} $$

whilst the integrated-out bit is

$$ -(1-\xi)\,y(0) + 0\cdot y'(0) - \xi\,y(1) - 0\cdot y'(1). \tag{5.68} $$

Therefore, we have

$$ y(\xi) = \int_0^1 G(x,\xi)f(x)\,dx + (1-\xi)\,y(0) + \xi\,y(1). \tag{5.69} $$

Here the term with $f(x)$ is the particular integral, whilst the remaining terms
constitute the complementary function (obeying the differential equation
without the source term) which serves to satisfy the boundary conditions.
Observe that the arguments in $G(x,\xi)$ are not in the usual order, but, in the
present example, this does not matter because $G$ is symmetric.

When the operator $L$ is not self-adjoint, we need to distinguish between
$L$ and $L^\dagger$, and $G$ and $G^\dagger$. We then apply Lagrange's identity
to the unknown function $u(x)$ and $\chi(x) = G^\dagger(x,\xi)$.

Example: We will use the Green-function method to solve the differential
equation

$$ \frac{du}{dx} = f(x), \qquad x \in [0,1], \quad u(0) = a. \tag{5.70} $$

We can, of course, write down the answer to this problem directly, but it
is interesting to see how the general strategy produces the solution. We
first find the Green function $G(x,\xi)$ for the operator with the corresponding
homogeneous boundary conditions. In the present case, this operator is

$$ L = \partial_x, \qquad \mathcal{D}(L) = \{u, Lu \in L^2[0,1] : u(0) = 0\}, \tag{5.71} $$

and the appropriate Green function is $G(x,\xi) = \theta(x-\xi)$. From $G$ we
then read off the adjoint Green function as
$G^\dagger(x,\xi) = \left(G(\xi,x)\right)^*$. In the present example, we have
$G^\dagger(x,\xi) = \theta(\xi-x)$. We now use Lagrange's identity in the form

$$ \int_0^1 dx\,\left\{ \left(L_x^\dagger G^\dagger(x,\xi)\right)^* u(x) - \left(G^\dagger(x,\xi)\right)^* L_x u(x) \right\} = \left[Q(G^\dagger, u)\right]_0^1. \tag{5.72} $$

In all cases, the left hand side is equal to

$$ \int_0^1 dx\,\left\{ \delta(x-\xi)u(x) - G^T(x,\xi)f(x) \right\}, \tag{5.73} $$

where $T$ denotes transpose, $G^T(x,\xi) = G(\xi,x)$. The left hand side is
therefore equal to

$$ u(\xi) - \int_0^1 dx\,G(\xi,x)f(x). \tag{5.74} $$

The right hand side depends on the details of the problem. In the present
case, the integrated out part is

$$ \left[Q(G^\dagger, u)\right]_0^1 = -\left[G^T(x,\xi)u(x)\right]_0^1 = u(0). \tag{5.75} $$

At the last step we have used the specific form $G^T(x,\xi) = \theta(\xi-x)$ to
find that only the lower limit contributes. The end result is therefore the
expected one:

$$ u(\xi) = u(0) + \int_0^\xi f(x)\,dx. \tag{5.76} $$

Variations of this strategy enable us to solve any inhomogeneous boundary- 
value problem in terms of the Green function for the corresponding homoge- 
neous boundary-value problem. 



5.4 Eigenfunction Expansions 

Self-adjoint operators possess a complete set of eigenfunctions, and we can 
expand the Green function in terms of these. Let 

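$$ L\varphi_n = \lambda_n\varphi_n. \tag{5.77} $$

Let us further suppose that none of the $\lambda_n$ are zero. Then the Green
function has the eigenfunction expansion

$$ G(x,\xi) = \sum_n \frac{\varphi_n(x)\varphi_n^*(\xi)}{\lambda_n}. \tag{5.78} $$

That this is so follows from

$$ L_x\left(\sum_n \frac{\varphi_n(x)\varphi_n^*(\xi)}{\lambda_n}\right) = \sum_n \frac{\left(L_x\varphi_n(x)\right)\varphi_n^*(\xi)}{\lambda_n} = \sum_n \frac{\lambda_n\varphi_n(x)\varphi_n^*(\xi)}{\lambda_n} = \sum_n \varphi_n(x)\varphi_n^*(\xi) = \delta(x-\xi). \tag{5.79} $$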
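Example: Consider our familiar exemplar

$$ L = -\partial_x^2, \qquad \mathcal{D}(L) = \{y, Ly \in L^2[0,1] : y(0) = y(1) = 0\}, \tag{5.80} $$

for which

$$ G(x,\xi) = \begin{cases} x(1-\xi), & x < \xi, \\ \xi(1-x), & x > \xi. \end{cases} \tag{5.81} $$

Computing the Fourier series shows that

$$ G(x,\xi) = \sum_{n=1}^\infty \left(\frac{2}{n^2\pi^2}\right)\sin(n\pi x)\sin(n\pi\xi). \tag{5.82} $$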
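
The agreement between the closed form and the Fourier series can be checked directly. This is a minimal sketch; the truncation order `nmax` and the sample point are arbitrary illustrative choices.

```python
import math


def G_closed(x, xi):
    """Closed form (5.81)."""
    return x * (1.0 - xi) if x < xi else xi * (1.0 - x)


def G_series(x, xi, nmax=2000):
    """Partial sum of the eigenfunction expansion (5.82)."""
    return sum(2.0 / (n * math.pi) ** 2
               * math.sin(n * math.pi * x) * math.sin(n * math.pi * xi)
               for n in range(1, nmax + 1))


print(G_closed(0.3, 0.7), G_series(0.3, 0.7))
```

The terms fall off like $1/n^2$, so the truncation error after $N$ terms is bounded by $2/(\pi^2 N)$.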



Modified Green function 

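When one or more of the eigenvalues is zero, a modified Green function is
obtained by simply omitting the corresponding terms from the series:

$$ G_{\rm mod}(x,\xi) = \sum_{\lambda_n \ne 0} \frac{\varphi_n(x)\varphi_n^*(\xi)}{\lambda_n}. \tag{5.83} $$

Then

$$ L_x G_{\rm mod}(x,\xi) = \delta(x-\xi) - \sum_{\lambda_n = 0} \varphi_n(x)\varphi_n^*(\xi). \tag{5.84} $$

We see that this $G_{\rm mod}$ is still hermitian, and, as a function of $x$, is
orthogonal to the zero modes. These are the properties we elected when
constructing the modified Green function in equation (5.57).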



5.5 Analytic Properties of Green Functions 

In this section we study the properties of Green functions considered as 
functions of a complex variable. Some of the formulae are slightly easier to 
derive using contour integral methods, but these are not necessary and we will 
not use them here. The only complex-variable prerequisite is a familiarity
with complex arithmetic and, in particular, knowledge of how to take the 
logarithm and the square root of a complex number. 
5.5.1 Causality implies analyticity

Consider a Green function of the form $G(t-\tau)$ and possessing the causal
property that $G(t-\tau) = 0$, for $t < \tau$. If the improper integral defining
its Fourier transform,

$$ \tilde G(\omega) = \int_0^\infty e^{i\omega t} G(t)\,dt \equiv \lim_{T\to\infty}\left\{\int_0^T e^{i\omega t} G(t)\,dt\right\}, \tag{5.85} $$

converges for real $\omega$, it will converge even better when $\omega$ has a
positive imaginary part. Consequently $\tilde G(\omega)$ will be a well-behaved
function of the complex variable $\omega$ everywhere in the upper half of the
complex $\omega$ plane. Indeed, it will be analytic there, meaning that its
Taylor series expansion about any point actually converges to the function.
For example, the Green function for the damped harmonic oscillator

$$ G(t) = \begin{cases} \frac{1}{\Omega}e^{-\gamma t}\sin(\Omega t), & t > 0, \\ 0, & t < 0, \end{cases} \tag{5.86} $$

has Fourier transform

$$ \tilde G(\omega) = \frac{1}{\Omega^2 - (\omega + i\gamma)^2}, \tag{5.87} $$

which is always finite in the upper half-plane, although it has pole
singularities at $\omega = -i\gamma \pm \Omega$ in the lower half-plane.
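
The transform (5.87) can be confirmed by brute-force quadrature of (5.85), both on the real axis and at a point in the upper half-plane. The parameter values, the truncation time `T` and the function names below are assumptions made for this sketch.

```python
import cmath
import math


def G(t, gamma, Omega):
    """Causal damped-oscillator Green function (5.86)."""
    return math.exp(-gamma * t) * math.sin(Omega * t) / Omega if t > 0 else 0.0


def fourier(omega, gamma, Omega, T=60.0, n=60000):
    """Midpoint-rule approximation to int_0^T exp(i omega t) G(t) dt, as in (5.85)."""
    h = T / n
    return sum(cmath.exp(1j * omega * (j + 0.5) * h) * G((j + 0.5) * h, gamma, Omega)
               for j in range(n)) * h


def closed_form(omega, gamma, Omega):
    """Equation (5.87)."""
    return 1.0 / (Omega**2 - (omega + 1j * gamma) ** 2)


gamma, Omega = 0.5, 2.0              # hypothetical parameter values
for omega in (1.3, 1.3 + 0.4j):      # a real frequency and an upper-half-plane point
    print(omega, fourier(omega, gamma, Omega), closed_form(omega, gamma, Omega))
```

Giving $\omega$ a positive imaginary part adds an extra decaying factor to the integrand, which is the numerical counterpart of the statement that the integral converges "even better" there.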

The only way that the Fourier transform $\tilde G$ of a causal Green function
can have a pole singularity in the upper half-plane is if $G$ contains an
exponential factor growing in time, in which case the system is unstable to
perturbations (and the real-frequency Fourier transform does not exist). This
observation is at the heart of the Nyquist criterion for the stability of
linear electronic devices.

Inverting the Fourier transform, we have

$$ G(t) = \int_{-\infty}^{\infty} \frac{1}{\Omega^2 - (\omega + i\gamma)^2}\,e^{-i\omega t}\,\frac{d\omega}{2\pi} = \theta(t)\frac{1}{\Omega}e^{-\gamma t}\sin(\Omega t). \tag{5.88} $$

It is perhaps surprising that this integral is identically zero if $t < 0$, and
non-zero if $t > 0$. This is one of the places where contour integral methods
might cast some light, but because we have confidence in the Fourier inversion
formula, we know that it must be correct.
Remember that in deriving (5.88) we have explicitly assumed that the
damping coefficient $\gamma$ is positive. It is important to realize that
reversing the sign of $\gamma$ on the left-hand side of (5.88) does more than
just change $e^{-\gamma t} \to e^{\gamma t}$ on the right-hand side. Naively
setting $\gamma \to -\gamma$ on both sides of (5.88) gives an equation that
cannot possibly be true. The left-hand side would be the Fourier transform of a
smooth function, and the Riemann-Lebesgue lemma tells us that such a Fourier
transform must become zero when $|t| \to \infty$. The right-hand side, to the
contrary, would be a function whose oscillations grow without bound as $t$
becomes large and positive.

To find the correct equation, observe that we can legitimately effect the
sign-change $\gamma \to -\gamma$ by first complex-conjugating the integral and
then changing $t$ to $-t$. Performing these two operations on both sides of
(5.88) leads to

$$ \int_{-\infty}^{\infty} \frac{1}{\Omega^2 - (\omega - i\gamma)^2}\,e^{-i\omega t}\,\frac{d\omega}{2\pi} = -\theta(-t)\frac{1}{\Omega}e^{\gamma t}\sin(\Omega t). \tag{5.89} $$

The new right-hand side represents an exponentially growing oscillation that
is suddenly silenced by the kick at $t = 0$.




Figure 5.6: The effect on $G(t)$, the Green function of an undamped oscillator,
of changing $i\gamma$ from $+i\varepsilon$ to $-i\varepsilon$ (panels labelled
$i\gamma = +i\varepsilon$ and $i\gamma = -i\varepsilon$).

The effect of taking the damping parameter $\gamma$ from an infinitesimally
small positive value $\varepsilon$ to an infinitesimally small negative value
$-\varepsilon$ is therefore to turn the causal Green function (no motion before
it is started by the delta-function kick) of the undamped oscillator into an
anti-causal Green function (no motion after it is stopped by the kick).
Ultimately, this is because the differential operator corresponding to a
harmonic oscillator with initial-value data is not self-adjoint, and its
adjoint operator corresponds to a harmonic oscillator with final-value data.

This discontinuous dependence on an infinitesimal damping parameter is
the subject of the next few sections.

Physics application: Caldeira-Leggett in frequency space 

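If we write the Caldeira-Leggett equations of motion (5.36) in Fourier
frequency space by setting

$$ Q(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,Q(\omega)e^{-i\omega t}, \tag{5.90} $$

and

$$ q_i(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,q_i(\omega)e^{-i\omega t}, \tag{5.91} $$

we have (after including an external force $F_{\rm ext}$ to drive the system)

$$ \begin{aligned} \left(-\omega^2 + (\Omega^2 - \Delta\Omega^2)\right)Q(\omega) + \sum_i f_i q_i(\omega) &= F_{\rm ext}(\omega), \\ (-\omega^2 + \omega_i^2)\,q_i(\omega) + f_i Q(\omega) &= 0. \end{aligned} \tag{5.92} $$

Eliminating the $q_i$, we obtain

$$ \left(-\omega^2 + (\Omega^2 - \Delta\Omega^2)\right)Q(\omega) - \sum_i \frac{f_i^2}{\omega_i^2 - \omega^2}\,Q(\omega) = F_{\rm ext}(\omega). \tag{5.93} $$

As before, sums over the index $i$ are replaced by integrals over the spectral
function

$$ \sum_i \frac{f_i^2}{\omega_i^2 - \omega^2} \to \frac{2}{\pi}\int_0^\infty \frac{\omega' J(\omega')}{\omega'^2 - \omega^2}\,d\omega', \tag{5.94} $$

and

$$ -\Delta\Omega^2 \equiv \sum_i \left(\frac{f_i^2}{\omega_i^2}\right) \to \frac{2}{\pi}\int_0^\infty \frac{J(\omega')}{\omega'}\,d\omega'. \tag{5.95} $$

Then

$$ Q(\omega) = \left(\frac{1}{\Omega^2 - \omega^2 + \Pi(\omega)}\right)F_{\rm ext}(\omega), \tag{5.96} $$

where the self-energy $\Pi(\omega)$ is given by

$$ \Pi(\omega) = \frac{2}{\pi}\int_0^\infty \left\{\frac{J(\omega')}{\omega'} - \frac{\omega' J(\omega')}{\omega'^2 - \omega^2}\right\} d\omega' = -\omega^2\,\frac{2}{\pi}\int_0^\infty \frac{J(\omega')}{\omega'(\omega'^2 - \omega^2)}\,d\omega'. \tag{5.97} $$

The expression

$$ G(\omega) \equiv \frac{1}{\Omega^2 - \omega^2 + \Pi(\omega)} \tag{5.98} $$

is a typical response function. Analogous objects occur in all branches of
physics.

For viscous damping we know that $J(\omega) = \eta\omega$. Let us evaluate the
integral occurring in $\Pi(\omega)$ for this case:

$$ I(\omega) = \int_0^\infty \frac{d\omega'}{\omega'^2 - \omega^2}. \tag{5.99} $$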



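We will initially assume that $\omega$ is positive. Now,

$$ \frac{1}{\omega'^2 - \omega^2} = \frac{1}{2\omega}\left(\frac{1}{\omega' - \omega} - \frac{1}{\omega' + \omega}\right), \tag{5.100} $$

so

$$ I(\omega) = \left[\frac{1}{2\omega}\left(\ln(\omega' - \omega) - \ln(\omega' + \omega)\right)\right]_{\omega'=0}^{\infty}. \tag{5.101} $$

At the upper limit we have $\ln\left((\infty - \omega)/(\infty + \omega)\right) = \ln 1 = 0$.
The lower limit contributes

$$ -\frac{1}{2\omega}\left(\ln(-\omega) - \ln(\omega)\right). \tag{5.102} $$

To evaluate the logarithm of a negative quantity we must use

$$ \ln\omega = \ln|\omega| + i\arg\omega, \tag{5.103} $$

where we will take $\arg\omega$ to lie in the range $-\pi < \arg\omega < \pi$.

Figure 5.7: When $\omega$ has a small positive imaginary part,
$\arg(-\omega) \approx -\pi$.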
To get an unambiguous answer, we need to give $\omega$ an infinitesimal
imaginary part $\pm i\varepsilon$. Depending on the sign of this imaginary part,
we find that

$$ I(\omega \pm i\varepsilon) = \pm\frac{i\pi}{2\omega}. \tag{5.104} $$

This formula remains true when the real part of $\omega$ is negative, and so

$$ \Pi(\omega \pm i\varepsilon) = \mp i\eta\omega. \tag{5.105} $$

Now the frequency-space version of

$$ \ddot Q(t) + \eta\dot Q + \Omega^2 Q = F_{\rm ext}(t) \tag{5.106} $$

is

$$ (-\omega^2 - i\eta\omega + \Omega^2)\,Q(\omega) = F_{\rm ext}(\omega), \tag{5.107} $$

so we must opt for the small shift in $\omega$ that leads to
$\Pi(\omega) = -i\eta\omega$. This means that we must regard $\omega$ as having
a positive infinitesimal imaginary part, $\omega \to \omega + i\varepsilon$.
This imaginary part is a good and needful thing: it effects the replacement of
the ill-defined singular integrals

$$ G(t) \stackrel{?}{=} \int_{-\infty}^{\infty} \frac{1}{\Omega^2 - \omega^2}\,e^{-i\omega t}\,\frac{d\omega}{2\pi}, \tag{5.108} $$

which arise as we transform back to real time, with the unambiguous
expressions

$$ G_\varepsilon(t) = \int_{-\infty}^{\infty} \frac{1}{\Omega^2 - (\omega + i\varepsilon)^2}\,e^{-i\omega t}\,\frac{d\omega}{2\pi}. \tag{5.109} $$

The latter, we know, give rise to properly causal real-time Green functions.

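
The branch choices behind (5.104) and (5.105) can be reproduced with a complex logarithm. The sketch below relies on Python's `cmath.log`, which returns the principal branch with the argument in $[-\pi, \pi]$; the numerical values of $\omega$, $\varepsilon$ and $\eta$ are hypothetical.

```python
import cmath
import math


def I(omega):
    """Lower-limit contribution (5.102), with the principal branch of the logarithm."""
    return -(cmath.log(-omega) - cmath.log(omega)) / (2.0 * omega)


def self_energy(omega, eta):
    """Pi(omega) = -omega^2 (2/pi) eta I(omega), from (5.97) with J(omega) = eta*omega."""
    return -omega**2 * (2.0 / math.pi) * eta * I(omega)


omega, eps, eta = 2.0, 1e-9, 0.3     # hypothetical values
print(I(omega + 1j * eps), 1j * math.pi / (2 * omega))
print(self_energy(omega + 1j * eps, eta), -1j * eta * omega)
print(self_energy(omega - 1j * eps, eta), +1j * eta * omega)
```

The same functions can be evaluated at negative real part, for example at `-2.0 + 1j * eps`, to see that (5.104) continues to hold there.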
5.5.2 Plemelj formulae 

The functions we are meeting can all be cast in the form

$$ f(\omega) = \frac{1}{\pi}\int_a^b \frac{\rho(\omega')}{\omega' - \omega}\,d\omega'. \tag{5.110} $$

If $\omega$ lies in the integration range $[a, b]$, then we divide by zero as we
integrate over $\omega' = \omega$. We ought to avoid doing this, but this
interval is often exactly where we desire to evaluate $f$. As before, we evade
the division by zero by giving $\omega$ an infinitesimally small imaginary part:
$\omega \to \omega \pm i\varepsilon$. We can then apply the Plemelj formulae,
named for the Slovenian mathematician Josip Plemelj, which say that

$$ \begin{aligned} \frac{1}{2}\left(f(\omega + i\varepsilon) - f(\omega - i\varepsilon)\right) &= i\rho(\omega), \\ \frac{1}{2}\left(f(\omega + i\varepsilon) + f(\omega - i\varepsilon)\right) &= \frac{1}{\pi}P\!\int_a^b \frac{\rho(\omega')}{\omega' - \omega}\,d\omega'. \end{aligned} \tag{5.111} $$

As explained in section 2.3.2, the "P" in front of the integral stands for
principal part. Recall that it means that we are to delete an infinitesimal
segment of the $\omega'$ integral lying symmetrically about the singular point
$\omega' = \omega$.

Figure 5.8: The analytic function $f(\omega)$ is discontinuous across the real
axis between $a$ and $b$.

The Plemelj formulae mean that the otherwise smooth and analytic function
$f(\omega)$ is discontinuous across the real axis between $a$ and $b$. If the
discontinuity $\rho(\omega)$ is itself an analytic function then the line
joining the points $a$ and $b$ is a branch cut, and the endpoints of the
integral are branch-point singularities of $f(\omega)$.

The reason for the discontinuity may be understood by considering figure
5.9. The singular integrand is a product of $\rho(\omega')$ with

$$ \frac{1}{\omega' - (\omega \pm i\varepsilon)} = \frac{\omega' - \omega}{(\omega' - \omega)^2 + \varepsilon^2} \pm \frac{i\varepsilon}{(\omega' - \omega)^2 + \varepsilon^2}. \tag{5.112} $$

The first term on the right is a symmetrically cut-off version of
$1/(\omega' - \omega)$ and provides the principal part integral. The second term
sharpens and tends to the delta function $\pm i\pi\delta(\omega' - \omega)$ as
$\varepsilon \to 0$, and so gives $\pm i\pi\rho(\omega)$. Because of this
explanation, the Plemelj equations are commonly encoded in physics papers via
the "$i\varepsilon$" cabbala

$$ \frac{1}{\omega' - (\omega \pm i\varepsilon)} = P\left(\frac{1}{\omega' - \omega}\right) \pm i\pi\delta(\omega' - \omega). \tag{5.113} $$

Figure 5.9: Sketch of the real and imaginary parts of
$g(\omega') = 1/(\omega' - (\omega + i\varepsilon))$.
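
The Plemelj formulae can be observed numerically with a small but finite $\varepsilon$. The sketch below assumes the simplest spectral weight, $\rho = 1$ on $[0, 1]$, for which the principal-part integral is $\ln((1-x)/x)$; the grid size and the value of $\varepsilon$ are illustrative.

```python
import math


def f(omega, rho, a=0.0, b=1.0, n=20000):
    """Midpoint-rule approximation to (5.110) for complex omega off the real axis."""
    h = (b - a) / n
    total = 0j
    for j in range(n):
        wp = a + (j + 0.5) * h
        total += rho(wp) / (wp - omega)
    return total * h / math.pi


rho = lambda w: 1.0                  # hypothetical spectral weight on [0, 1]
x, eps = 0.3, 0.01                   # evaluation point and small imaginary shift

f_plus, f_minus = f(x + 1j * eps, rho), f(x - 1j * eps, rho)
print((f_plus - f_minus) / 2)        # close to i * rho(x)
print((f_plus + f_minus) / 2)        # close to the principal-part integral
print(math.log((1 - x) / x) / math.pi)   # exact principal part for rho = 1
```

With finite $\varepsilon$ the discontinuity differs from $i\rho(x)$ by a correction of order $\varepsilon$, which comes from the Lorentzian tails being cut off at the ends of the interval.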

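If $\rho$ is real, as it often is, then
$f(\omega + i\eta) = \left(f(\omega - i\eta)\right)^*$. The discontinuity across
the real axis is then purely imaginary, and

$$ \frac{1}{2}\left(f(\omega + i\varepsilon) + f(\omega - i\varepsilon)\right) \tag{5.114} $$

is the real part of $f$. In this case we can write (5.110) as

$$ {\rm Re}\,f(\omega) = \frac{1}{\pi}P\!\int_a^b \frac{{\rm Im}\,f(\omega')}{\omega' - \omega}\,d\omega'. \tag{5.115} $$

This formula is typical of the relations linking the real and imaginary parts
of causal response functions.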

A practical example of such a relation is provided by the complex,
frequency-dependent, refractive index, $n(\omega)$, of a medium. This is defined
so that a travelling electromagnetic wave takes the form

$$ E(x,t) = E_0\,e^{i n(\omega) k x - i\omega t}. \tag{5.116} $$

Here, $k = \omega/c$ is the in vacuo wavenumber. We can decompose $n$ into its
real and imaginary parts:

$$ n(\omega) = n_R + i n_I = n_R(\omega) + \frac{i}{2|k|}\gamma(\omega), \tag{5.117} $$
where $\gamma$ is the extinction coefficient, defined so that the intensity
falls off as $I = I_0\exp(-\gamma x)$. A non-zero $\gamma$ can arise from either
energy absorption or scattering out of the forward direction. For the
refractive index, the function $f(\omega) = n(\omega) - 1$ can be written in the
form of (5.110), and, using $n(-\omega) = n^*(\omega)$, this leads to the
Kramers-Kronig relation

$$ n_R(\omega) = 1 + \frac{c}{\pi}P\!\int_0^\infty \frac{\gamma(\omega')}{\omega'^2 - \omega^2}\,d\omega'. \tag{5.118} $$

Formulae like this will be rigorously derived in chapter ?? by the use of
contour-integral methods.

5.5.3 Resolvent operator

Given a differential operator $L$, we define the resolvent operator to be
$R_\lambda \equiv (L - \lambda I)^{-1}$. The resolvent is an analytic function of
$\lambda$, except when $\lambda$ lies in the spectrum of $L$.

We expand $R_\lambda$ in terms of the eigenfunctions as

$$ R_\lambda(x,\xi) = \sum_n \frac{\varphi_n(x)\varphi_n^*(\xi)}{\lambda_n - \lambda}. \tag{5.119} $$

When the spectrum is discrete, the resolvent has poles at the eigenvalues of
$L$. When the operator $L$ has a continuous spectrum, the sum becomes an
integral:

$$ R_\lambda(x,\xi) \to \int_{\mu\in\sigma(L)} \rho(\mu)\,\frac{\varphi_\mu(x)\varphi_\mu^*(\xi)}{\mu - \lambda}\,d\mu, \tag{5.120} $$

where $\rho(\mu)$ is the eigenvalue density of states. This is of the form that
we saw in connection with the Plemelj formulae. Consequently, when the
spectrum comprises segments of the real axis, the resulting analytic function
$R_\lambda$ will be discontinuous across the real axis within them. The
endpoints of the segments will be branch point singularities of $R_\lambda$,
and the segments themselves, considered as subsets of the complex plane, are
the branch cuts.

The trace of the resolvent ${\rm Tr}\,R_\lambda$ is defined by

$$ \begin{aligned} {\rm Tr}\,R_\lambda &= \int dx\,\{R_\lambda(x,x)\} \\ &= \int dx\,\left\{\sum_n \frac{\varphi_n(x)\varphi_n^*(x)}{\lambda_n - \lambda}\right\} \\ &= \sum_n \frac{1}{\lambda_n - \lambda} \\ &\to \int \frac{\rho(\mu)}{\mu - \lambda}\,d\mu. \end{aligned} \tag{5.121} $$

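Applying Plemelj to $R_\lambda$, we have

$$ {\rm Im}\left[\lim_{\varepsilon\to 0}\left\{{\rm Tr}\,R_{\lambda + i\varepsilon}\right\}\right] = \pi\rho(\lambda). \tag{5.122} $$

Here, we have used the fact that $\rho$ is real, so

$$ {\rm Tr}\,R_{\lambda - i\varepsilon} = \left({\rm Tr}\,R_{\lambda + i\varepsilon}\right)^*. \tag{5.123} $$

The non-zero imaginary part therefore shows that $R_\lambda$ is discontinuous
across the real axis at points lying in the continuous spectrum.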
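Example: Consider 

$$L = -\partial_x^2 + m^2, \qquad \mathcal{D}(L) = \{y, Ly \in L^2[-\infty,\infty]\}. \tag{5.124}$$

As we know, this operator has a continuous spectrum, with eigenfunctions 

$$\varphi_k = \frac{1}{\sqrt{L}}\,e^{ikx}. \tag{5.125}$$

Here, $L$ is the (very large) length of the interval. The eigenvalues are 
$E = k^2 + m^2$, so the spectrum is all positive numbers greater than $m^2$. The 
momentum density of states is 

$$\rho(k) = \frac{L}{2\pi}. \tag{5.126}$$

The completeness relation is 

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\,e^{ik(x-\xi)} = \delta(x-\xi), \tag{5.127}$$

which is just the Fourier integral formula for the delta function. 
The Green function for $L$ is 

$$G(x-y) = \int_{-\infty}^{\infty} dk \left(\frac{dn}{dk}\right) \frac{\varphi_k(x)\varphi_k^*(y)}{k^2+m^2} = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\,\frac{e^{ik(x-y)}}{k^2+m^2} = \frac{1}{2m}\,e^{-m|x-y|}, \tag{5.128}$$

where $dn/dk = \rho(k)$ is the momentum density of states. 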
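The last equality in (5.128) is easy to test numerically. The following is a minimal sketch, not part of the original derivation: it truncates the $k$ integral at an assumed cutoff `k_max` and uses the trapezoidal rule, so agreement is only expected to a few decimal places. The function names and sample values are illustrative.

```python
import math

def green_by_fourier(x, m, k_max=400.0, n=200_000):
    """Approximate (1/2pi) * integral of e^{ikx}/(k^2+m^2) over [-k_max, k_max].

    The sine part of e^{ikx} is odd and cancels, so we integrate
    cos(kx)/(k^2+m^2) over [0, k_max] with the trapezoidal rule
    and divide by pi.
    """
    h = k_max / n
    total = 0.5 * (1.0 / (m * m) + math.cos(k_max * x) / (k_max * k_max + m * m))
    for j in range(1, n):
        k = j * h
        total += math.cos(k * x) / (k * k + m * m)
    return total * h / math.pi

def green_closed_form(x, m):
    """The closed form exp(-m|x|)/(2m) quoted in (5.128)."""
    return math.exp(-m * abs(x)) / (2.0 * m)

x, m = 0.7, 1.5
print(green_by_fourier(x, m), green_closed_form(x, m))
```

The two printed numbers should agree closely; the residual difference comes from the truncated tail of the integral, which is why a finite tolerance is needed in any comparison.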
Figure 5.10: If $\mathrm{Im}\,\lambda > 0$, and with the branch cut for $\sqrt{z}$ in its usual place 
along the negative real axis, then $\sqrt{-\lambda}$ has negative imaginary part and 
positive real part. 



We can use the same calculation to look at the resolvent $R_\lambda = (-\partial_x^2 - \lambda)^{-1}$. 
Replacing $m^2$ by $-\lambda$, we have 

$$R_\lambda(x,y) = \frac{1}{2\sqrt{-\lambda}}\,e^{-\sqrt{-\lambda}\,|x-y|}. \tag{5.129}$$

To appreciate this expression, we need to know how to evaluate $\sqrt{z}$ where 
$z$ is complex. We write $z = |z|e^{i\phi}$ where we require $-\pi < \phi < \pi$. We now 
define 

$$\sqrt{z} = \sqrt{|z|}\,e^{i\phi/2}. \tag{5.130}$$

When we evaluate $\sqrt{z}$ for $z$ just below the negative real axis then this 
definition gives $-i\sqrt{|z|}$, and just above the axis we find $+i\sqrt{|z|}$. The discontinuity 
means that the negative real axis is a branch cut for the square-root 
function. The $\sqrt{-\lambda}$'s appearing in $R_\lambda$ therefore mean that the positive real axis 
will be a branch cut for $R_\lambda$. This branch cut therefore coincides with the 
spectrum of $L$, as promised earlier. 
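The jump across the cut can be seen directly in a few lines of code. This is only an illustrative sketch: Python's `cmath.sqrt` uses the same principal branch (cut along the negative real axis), and the helper name `sqrt_minus_lambda` and the sample values are our own choices.

```python
import cmath

def sqrt_minus_lambda(lam: complex) -> complex:
    """Principal square root of -lam (branch cut of sqrt on the negative real axis)."""
    return cmath.sqrt(-lam)

lam, eps = 4.0, 1e-6
above = sqrt_minus_lambda(complex(lam, +eps))  # lambda + i*eps, just above the cut
below = sqrt_minus_lambda(complex(lam, -eps))  # lambda - i*eps, just below the cut

# The real parts are small and positive on both sides (so the resolvent decays),
# while the imaginary parts have opposite signs: this is the discontinuity.
print(above, below)
```

Evaluating on either side of the positive real $\lambda$ axis therefore gives two different limits, which is the numerical face of the branch cut.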
If $\lambda$ is positive and we shift $\lambda \to \lambda + i\varepsilon$ then 

$$\frac{1}{2\sqrt{-\lambda}}\,e^{-\sqrt{-\lambda}|x-y|} \to \frac{1}{-2i\sqrt{\lambda}}\,e^{+i\sqrt{\lambda}|x-y| - \varepsilon|x-y|/2\sqrt{\lambda}}. \tag{5.131}$$

Notice that this decays away as $|x-y| \to \infty$. The square root retains a 
positive real part when $\lambda$ is shifted to $\lambda - i\varepsilon$, and so the decay is still present: 

$$\frac{1}{2\sqrt{-\lambda}}\,e^{-\sqrt{-\lambda}|x-y|} \to \frac{1}{+2i\sqrt{\lambda}}\,e^{-i\sqrt{\lambda}|x-y| - \varepsilon|x-y|/2\sqrt{\lambda}}. \tag{5.132}$$

In each case, with $\lambda$ either immediately above or immediately below the 
cut, the small imaginary part tempers the oscillatory behaviour of the Green 
function so that $\chi(x) = G(x,y)$ is square integrable and remains an element 
of $L^2[\mathbb{R}]$. 

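We now take the trace of $R_\lambda$ by setting $x = y$ and integrating: 

$$\mathrm{Tr}\,R_{\lambda+i\varepsilon} = -\frac{L}{2i\sqrt{|\lambda|}}. \tag{5.133}$$

Thus, 

$$\rho(\lambda) = \theta(\lambda)\,\frac{L}{2\pi\sqrt{|\lambda|}}, \tag{5.134}$$

which coincides with our direct calculation. 
Example: Let 

$$L = -i\partial_x, \qquad \mathcal{D}(L) = \{y, Ly \in L^2[\mathbb{R}]\}. \tag{5.135}$$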



This has eigenfunctions $e^{ikx}$ with eigenvalues $k$. The spectrum is therefore 
the entire real line. The local eigenvalue density of states is $1/2\pi$. The 
resolvent is therefore 

$$R_\lambda(x,\xi) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-\xi)}\,\frac{1}{k-\lambda}\,dk. \tag{5.136}$$

To evaluate this, first consider the Fourier transforms of 

$$F_1(x) = \theta(x)e^{-\kappa x}, \qquad F_2(x) = -\theta(-x)e^{\kappa x}, \tag{5.137}$$

where $\kappa$ is a positive real number. 




Figure 5.11: The functions $F_1(x) = \theta(x)e^{-\kappa x}$ and $F_2(x) = -\theta(-x)e^{\kappa x}$. 
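We have 

$$\int_{-\infty}^{\infty} \left\{\theta(x)e^{-\kappa x}\right\} e^{-ikx}\,dx = \frac{1}{i}\,\frac{1}{k - i\kappa}, \tag{5.138}$$

$$\int_{-\infty}^{\infty} \left\{-\theta(-x)e^{\kappa x}\right\} e^{-ikx}\,dx = \frac{1}{i}\,\frac{1}{k + i\kappa}. \tag{5.139}$$

Inverting the transforms gives 

$$\theta(x)e^{-\kappa x} = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{1}{k - i\kappa}\,e^{ikx}\,dk,$$

$$-\theta(-x)e^{\kappa x} = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{1}{k + i\kappa}\,e^{ikx}\,dk. \tag{5.140}$$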



These are important formulae in their own right, and you should take care 
to understand them. Now we apply them to evaluating the integral defining 
$R_\lambda$. 

If we write $\lambda = \mu + i\nu$, we find 

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-\xi)}\,\frac{1}{k-\lambda}\,dk = \begin{cases} i\theta(x-\xi)\,e^{i\mu(x-\xi)}e^{-\nu(x-\xi)}, & \nu > 0,\\ -i\theta(\xi-x)\,e^{i\mu(x-\xi)}e^{-\nu(x-\xi)}, & \nu < 0. \end{cases} \tag{5.141}$$

In each case, the resolvent is $\propto e^{i\lambda x}$ away from $\xi$, and has a jump of $+i$ at 
$x = \xi$ so as to produce the delta function. It decays either to the right or to 
the left, depending on the sign of $\nu$. The Heaviside factor ensures that it is 
multiplied by zero on the exponentially growing side of $e^{-\nu x}$, so as to satisfy 
the requirement of square integrability. 

Taking the trace of this resolvent is a little problematic. We are to set $x = \xi$ 
and integrate, but what value do we associate with $\theta(0)$? Remembering 
that Fourier transforms always give the mean of the two values at a jump 
discontinuity, it seems reasonable to set $\theta(0) = \frac{1}{2}$. With this definition, we 
have 

$$\mathrm{Tr}\,R_\lambda = \begin{cases} \frac{i}{2}L, & \mathrm{Im}\,\lambda > 0,\\ -\frac{i}{2}L, & \mathrm{Im}\,\lambda < 0. \end{cases} \tag{5.142}$$

Our choice is therefore compatible with $\mathrm{Im}\,\mathrm{Tr}\,R_{\lambda+i\varepsilon} = \pi\rho$, where $\rho = L/2\pi$. We have 
been lucky. The ambiguous expression $\theta(0)$ is not always safely evaluated as 
1/2. 
5.6 Locality and the Gelfand-Dikii equation 

The answers to many quantum physics problems can be expressed either as 
sums over wavefunctions or as expressions involving Green functions. One 
of the advantages of writing the answer in terms of Green functions is that 
these typically depend only on the local properties of the differential operator 
whose inverse they are. This locality is in contrast to the individual wave- 
functions and their eigenvalues, both of which are sensitive to the distant 
boundaries. Since physics is usually local, it follows that the Green function 
provides a more efficient route to the answer. 

By the Green function being local we mean that its value for $x$, $\xi$ near 
some point can be computed in terms of the coefficients in the differential 
operator evaluated near this point. To illustrate this claim, consider the 
Green function $G(x,\xi)$ for the Schrödinger operator $-\partial_x^2 + q(x) + \lambda$ on the 
entire real line. We will show that there is a not exactly obvious (but easy 
to obtain once you know the trick) local gradient expansion for the diagonal 
elements $D(x) = G(x,x)$. These elements are often all that is needed in 
physics. We begin by recalling that we can write 

$$G(x,\xi) \propto u(x)v(\xi)$$

where $u(x)$, $v(x)$ are solutions of $(-\partial_x^2 + q(x) + \lambda)y = 0$ satisfying suitable 
boundary conditions to the right and left respectively. We set $D(x) = G(x,x)$ 
and differentiate three times with respect to $x$. We find 

$$\partial_x^3 D(x) = u^{(3)}v + 3u''v' + 3u'v'' + uv^{(3)}$$

$$= \left(\partial_x[(q+\lambda)u]\right)v + 3(q+\lambda)\,\partial_x(uv) + \left(\partial_x[(q+\lambda)v]\right)u.$$

Here, in passing from the first to second line, we have used the differential 
equation obeyed by $u$ and $v$. We can re-express the second line as 

$$\left(q\,\partial_x + \partial_x q - \tfrac{1}{2}\partial_x^3\right)D(x) = -2\lambda\,\partial_x D(x). \tag{5.143}$$

This relation is known as the Gelfand-Dikii equation. Using it we can find 
an expansion for the diagonal element D(x) in terms of q and its derivatives. 
We begin by observing that for $q(x) = 0$ we know that $D(x) = 1/(2\sqrt{\lambda})$. We 
therefore conjecture that we can expand 

$$D(x) = \frac{1}{2\sqrt{\lambda}}\left(1 - \frac{b_1(x)}{2\lambda} + \frac{b_2(x)}{(2\lambda)^2} + \cdots + (-1)^n\frac{b_n(x)}{(2\lambda)^n} + \cdots\right).$$

If we insert this expansion into (5.143) we see that we get the recurrence 
relation 

$$\left(q\,\partial_x + \partial_x q - \tfrac{1}{2}\partial_x^3\right)b_n = \partial_x b_{n+1}. \tag{5.144}$$

We can therefore find $b_{n+1}$ from $b_n$ by differentiation followed by a single 
integration. Remarkably, $\partial_x b_{n+1}$ is always the exact derivative of a polynomial 
in $q$ and its derivatives. Further, the integration constants must be zero 
so that we recover the $q = 0$ result. If we carry out this process, we find 

$$b_1(x) = q(x),$$

$$b_2(x) = \frac{3q(x)^2}{2} - \frac{q''(x)}{2},$$

$$b_3(x) = \frac{5q(x)^3}{2} - \frac{5q'(x)^2}{4} - \frac{5q(x)q''(x)}{2} + \frac{q^{(4)}(x)}{4},$$

$$b_4(x) = \frac{35q(x)^4}{8} - \frac{35q(x)q'(x)^2}{4} - \frac{35q(x)^2q''(x)}{4} + \frac{21q''(x)^2}{8} + \frac{7q'(x)q^{(3)}(x)}{2} + \frac{7q(x)q^{(4)}(x)}{4} - \frac{q^{(6)}(x)}{8},$$

and so on. (Note how the terms in the expansion are graded: Each b n 
is homogeneous in powers of q and its derivatives, provided we count two 
x derivatives as being worth one q(x).) Keeping a few terms in this series 
expansion can provide an effective approximation for G(x, x), but, in general, 
the series is not convergent, being only an asymptotic expansion for D(x). 
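The recurrence (5.144) can be checked mechanically. The sketch below is our own illustration, not the authors' method: it represents a polynomial in $q, q', q'', \ldots$ as a dictionary of exponent tuples with exact rational coefficients, and confirms that $b_0 = 1$ together with $b_1,\ldots,b_4$, namely $b_1 = q$, $b_2 = \tfrac32 q^2 - \tfrac12 q''$, $b_3 = \tfrac52 q^3 - \tfrac54 q'^2 - \tfrac52 qq'' + \tfrac14 q^{(4)}$, $b_4 = \tfrac{35}{8}q^4 - \tfrac{35}{4}qq'^2 - \tfrac{35}{4}q^2q'' + \tfrac{21}{8}q''^2 + \tfrac72 q'q^{(3)} + \tfrac74 qq^{(4)} - \tfrac18 q^{(6)}$, satisfy the recurrence. All function names are hypothetical.

```python
from fractions import Fraction as F

# A differential polynomial is a dict {exponents: coefficient}, where the tuple
# (e0, e1, e2, ...) stands for the monomial q^e0 (q')^e1 (q'')^e2 ...

def normalize(mono):
    """Strip trailing zero exponents so equal monomials share one key."""
    mono = list(mono)
    while mono and mono[-1] == 0:
        mono.pop()
    return tuple(mono)

def add(p, r, scale=1):
    """Return p + scale * r, dropping terms whose coefficient cancels."""
    out = dict(p)
    for mono, c in r.items():
        out[mono] = out.get(mono, 0) + scale * c
        if out[mono] == 0:
            del out[mono]
    return out

def times_q(p):
    """Multiply p by q."""
    out = {}
    for mono, c in p.items():
        m = list(mono) if mono else [0]
        m[0] += 1
        out[normalize(m)] = c
    return out

def d(p):
    """Total x-derivative: product rule with d/dx q^(i) = q^(i+1)."""
    out = {}
    for mono, c in p.items():
        for i, e in enumerate(mono):
            if e == 0:
                continue
            m = list(mono) + [0]
            m[i] -= 1
            m[i + 1] += 1
            out = add(out, {normalize(m): c * e})
    return out

def gd_operator(b):
    """Apply (q d + d q - (1/2) d^3) to b, as on the left of (5.144)."""
    return add(add(times_q(d(b)), d(times_q(b))), d(d(d(b))), F(-1, 2))

b = [
    {(): F(1)},
    {(1,): F(1)},
    {(2,): F(3, 2), (0, 0, 1): F(-1, 2)},
    {(3,): F(5, 2), (0, 2): F(-5, 4), (1, 0, 1): F(-5, 2), (0, 0, 0, 0, 1): F(1, 4)},
    {(4,): F(35, 8), (1, 2): F(-35, 4), (2, 0, 1): F(-35, 4), (0, 0, 2): F(21, 8),
     (0, 1, 0, 1): F(7, 2), (1, 0, 0, 0, 1): F(7, 4), (0, 0, 0, 0, 0, 0, 1): F(-1, 8)},
]

checks = [gd_operator(b[n]) == d(b[n + 1]) for n in range(len(b) - 1)]
print(checks)
```

Each entry of `checks` compares the two sides of (5.144) term by term. The code only verifies given coefficients; producing $b_{n+1}$ from $b_n$ would additionally require the integration step described in the text.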

A similar strategy produces expansions for the diagonal element of the 
Green function of other one-dimensional differential operators. Such gradient 
expansions also exist in higher dimensions but the higher-dimensional 
Seeley-coefficient functions are not as easy to compute. Gradient expansions 
for the off-diagonal elements also exist, but, again, they are harder to obtain. 



5.7 Further Exercises and problems 

Here are some further exercises that are intended to illustrate the material 
of this chapter: 

Exercise 5.1: Fredholm Alternative. A heavy elastic bar with uniform mass 
$m$ per unit length lies almost horizontally. It is supported by a distribution of 
upward forces $F(x)$. 

Figure 5.12: Elastic bar 

The shape of the bar, $y(x)$, can be found by minimizing the energy 

$$U[y] = \int_0^L \left\{\frac{1}{2}\kappa(y'')^2 - (F(x) - mg)\,y\right\} dx.$$

• Show that this minimization leads to the equation 

$$\hat{L}y \equiv \kappa\frac{d^4y}{dx^4} = F(x) - mg, \qquad y'' = y''' = 0 \quad \text{at } x = 0, L.$$

• Show that the boundary conditions are such that the operator L is self- 
adjoint with respect to an inner product with weight function 1. 

• Find the zero modes which span the null space of L. 

• If there are n linearly independent zero modes, then the codimension of 
the range of L is also n. Using your explicit solutions from the previous 
part, find the conditions that must be obeyed by F(x) for a solution of 
Ly = F — mg to exist. What is the physical meaning of these conditions? 

• The solution to the equation and boundary conditions is not unique. Is 
this non-uniqueness physically reasonable? Explain. 

Exercise 5.2: Flexible rod again. A flexible rod is supported near its ends by 
means of knife edges that constrain its position, but not its slope or curvature. 
It is acted on by a force $F(x)$. 

Figure 5.13: Simply supported rod. 

The deflection of the rod is found by solving the boundary value problem 

$$\frac{d^4y}{dx^4} = F(x), \qquad y(0) = y(1) = 0, \quad y''(0) = y''(1) = 0.$$

We wish to find the Green function $G(x,\xi)$ that facilitates the solution of this 
problem. 

a) If the differential operator and domain (boundary conditions) above is 
denoted by $L$, what is the operator and domain for $L^\dagger$? Is the problem 
self-adjoint? 

b) Are there any zero-modes? Does F have to satisfy any conditions for the 
solution to exist? 

c) Write down the conditions, if any, obeyed by $G(x,\xi)$ and its derivatives 
$\partial_x G(x,\xi)$, $\partial^2_{xx} G(x,\xi)$, $\partial^3_{xxx} G(x,\xi)$ at $x = 0$, $x = \xi$, and $x = 1$. 

d) Using the conditions above, find $G(x,\xi)$. (This requires some boring 
algebra, but if you start from the "jump condition" and work down, 
it can be completed in under a page.) 

e) Is your Green function symmetric ($G(x,\xi) = G(\xi,x)$)? Is this in 
accord with the self-adjointness or not of the problem? (You can use this 
property as a check of your algebra.) 

f ) Write down the integral giving the general solution of the boundary value 
problem. Assume, if necessary, that F(x) is in the range of the differential 
operator. Differentiate your answer and see if it does indeed satisfy the 
differential equation and boundary conditions. 



Exercise 5.3: Hot ring. The equation governing the steady state heat flow on 
a thin ring of unit circumference is 

$$-y'' = f, \qquad 0 < x < 1, \qquad y(0) = y(1), \quad y'(0) = y'(1).$$

a) This problem has a zero mode. Find the zero mode and the consequent 
condition on $f(x)$ for a solution to exist. 

b) Verify that a suitable modified Green function for the problem is 

$$g(x,\xi) = \frac{1}{2}(x-\xi)^2 - \frac{1}{2}|x-\xi|.$$

You will need to verify that $g(x,\xi)$ satisfies both the differential equation 
and the boundary conditions. 
Exercise 5.4: By using the observation that the left hand side is $2\pi$ times the 
eigenfunction expansion of a modified Green function $G(x,0)$ for $L = -\partial_x^2$ on 
a circle of unit radius, show that 

$$\sum_{n=-\infty}^{\infty} \frac{e^{inx}}{n^2} = \frac{1}{2}(x-\pi)^2 - \frac{\pi^2}{6}, \qquad x \in [0, 2\pi).$$

The term with $n = 0$ is to be omitted from the sum. 
Exercise 5.5: Seek a solution to the equation 

$$-\frac{d^2y}{dx^2} = f(x), \qquad x \in [0,1]$$

with inhomogeneous boundary conditions $y'(0) = F_0$, $y'(1) = F_1$. Observe 
that the corresponding homogeneous boundary condition problem has a zero 
mode. Therefore the solution, if one exists, cannot be unique. 

a) Show that there can be no solution to the differential equation and 
inhomogeneous boundary condition unless $f(x)$ satisfies the condition 

$$\int_0^1 f(x)\,dx = F_0 - F_1. \tag{$\star$}$$

b) Let $G(x,\xi)$ denote the modified Green function (5.57) 

$$G(x,\xi) = \begin{cases} \frac{1}{3} - \xi + \frac{x^2+\xi^2}{2}, & 0 < x < \xi,\\ \frac{1}{3} - x + \frac{x^2+\xi^2}{2}, & \xi < x < 1. \end{cases}$$

Use the Lagrange-identity method for inhomogeneous boundary conditions 
to deduce that if a solution exists then it necessarily obeys 

$$y(x) = \int_0^1 y(\xi)\,d\xi + \int_0^1 G(\xi,x)f(\xi)\,d\xi + G(1,x)F_1 - G(0,x)F_0.$$

c) By differentiating with respect to $x$, show that 

$$y_{\rm tentative}(x) = \int_0^1 G(\xi,x)f(\xi)\,d\xi + G(1,x)F_1 - G(0,x)F_0 + C,$$

where $C$ is an arbitrary constant, obeys the boundary conditions. 

d) By differentiating a second time with respect to $x$, show that $y_{\rm tentative}(x)$ 
is a solution of the differential equation if, and only if, the condition ($\star$) is 
satisfied. 
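Exercise 5.6: Lattice Green Functions. The $k \times k$ matrices 

$$T_1 = \begin{pmatrix} 2 & -1 & 0 & 0 & 0 & \ldots & 0\\ -1 & 2 & -1 & 0 & 0 & \ldots & 0\\ 0 & -1 & 2 & -1 & 0 & \ldots & 0\\ \vdots & & \ddots & \ddots & \ddots & & \vdots\\ 0 & \ldots & 0 & -1 & 2 & -1 & 0\\ 0 & \ldots & 0 & 0 & -1 & 2 & -1\\ 0 & \ldots & 0 & 0 & 0 & -1 & 2 \end{pmatrix}, \qquad T_2 = \begin{pmatrix} 2 & -1 & 0 & 0 & 0 & \ldots & 0\\ -1 & 2 & -1 & 0 & 0 & \ldots & 0\\ 0 & -1 & 2 & -1 & 0 & \ldots & 0\\ \vdots & & \ddots & \ddots & \ddots & & \vdots\\ 0 & \ldots & 0 & -1 & 2 & -1 & 0\\ 0 & \ldots & 0 & 0 & -1 & 2 & -1\\ 0 & \ldots & 0 & 0 & 0 & -1 & 1 \end{pmatrix}$$

represent two discrete lattice approximations to $-\partial_x^2$ on a finite interval. 

a) What are the boundary conditions defining the domains of the 
corresponding continuum differential operators? [They are either Dirichlet 
($y = 0$) or Neumann ($y' = 0$) boundary conditions.] Make sure you 
explain your reasoning. 

b) Verify that 

$$[T_1^{-1}]_{ij} = \min(i,j) - \frac{ij}{k+1},$$

$$[T_2^{-1}]_{ij} = \min(i,j).$$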

c) Find the continuum Green functions for the boundary value problems 
approximated by the matrix operators. Compare each of the matrix 
inverses with its corresponding continuum Green function. Are they 
similar? 
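Before attempting part b) by hand, it can be reassuring to test the claimed inverses for one modest value of $k$. The sketch below does this with exact rational arithmetic. It assumes the reading of the matrices in which $T_1$ has 2 in every diagonal entry and $T_2$ differs only in its last diagonal entry, which is 1. A numerical check for one $k$ is of course not a proof, and the helper names are our own.

```python
from fractions import Fraction

def second_difference(k, last_diagonal):
    """k x k tridiagonal matrix: 2 on the diagonal, -1 beside it,
    with the bottom-right entry replaced by last_diagonal."""
    T = [[Fraction(0)] * k for _ in range(k)]
    for i in range(k):
        T[i][i] = Fraction(2)
        if i > 0:
            T[i][i - 1] = Fraction(-1)
        if i < k - 1:
            T[i][i + 1] = Fraction(-1)
    T[k - 1][k - 1] = Fraction(last_diagonal)
    return T

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def is_identity(M):
    return all(M[i][j] == (1 if i == j else 0)
               for i in range(len(M)) for j in range(len(M)))

k = 6
idx = range(1, k + 1)  # the formulas use 1-based indices i, j
G1 = [[min(i, j) - Fraction(i * j, k + 1) for j in idx] for i in idx]
G2 = [[Fraction(min(i, j)) for j in idx] for i in idx]

print(is_identity(matmul(second_difference(k, 2), G1)),
      is_identity(matmul(second_difference(k, 1), G2)))
```

Changing `k` lets one see that the pattern persists, which is a useful hint for the structure of the general argument.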

Exercise 5.7: Eigenfunction expansion. The resolvent (Green function) $R_\lambda(x,\xi) = (L - \lambda)^{-1}$ 
can be expanded as 

$$R_\lambda(x,\xi) = \sum_n \frac{\varphi_n(x)\varphi_n(\xi)}{\lambda_n - \lambda},$$

where $\varphi_n(x)$ is the normalized eigenfunction corresponding to the eigenvalue 
$\lambda_n$. The resolvent therefore has a pole whenever $\lambda$ approaches $\lambda_n$. Consider 
the case 

$$R_{\omega^2}(x,\xi) = \left(-\frac{d^2}{dx^2} - \omega^2\right)^{-1}$$

with boundary conditions $y(0) = y(L) = 0$. 

a) Show that 

$$R_{\omega^2}(x,\xi) = \begin{cases} \dfrac{1}{\omega\sin\omega L}\,\sin\omega x\,\sin\omega(L-\xi), & x < \xi,\\[2ex] \dfrac{1}{\omega\sin\omega L}\,\sin\omega(L-x)\,\sin\omega\xi, & \xi < x. \end{cases}$$

b) Confirm that $R_{\omega^2}$ becomes singular at exactly those values of $\omega^2$ 
corresponding to eigenvalues $\omega_n^2$ of $-\frac{d^2}{dx^2}$. 

c) Find the associated eigenfunctions $\varphi_n(x)$ and, by taking the limit of 
$R_{\omega^2}$ as $\omega^2 \to \omega_n^2$, confirm that the residue of the pole (the coefficient of 
$1/(\omega_n^2 - \omega^2)$) is precisely the product of the normalized eigenfunctions 
$\varphi_n(x)\varphi_n(\xi)$. 
Exercise 5.8: In this exercise we will investigate the self adjointness of the 
operator $T = -i\,d/dx$ on the interval $[a,b]$ by using the resolvent operator 
$R_\lambda = (T - \lambda I)^{-1}$. 

a) The integral kernel $R_\lambda(x,\xi)$ is a Green function obeying 

$$\left(-i\frac{\partial}{\partial x} - \lambda\right)R_\lambda(x,\xi) = \delta(x-\xi).$$

Use standard methods to show that 

$$R_\lambda(x,\xi) = \frac{1}{2}\left(K_\lambda + i\,\mathrm{sgn}(x-\xi)\right)e^{i\lambda(x-\xi)},$$

where $K_\lambda$ is a number that depends on the boundary conditions imposed 
at the endpoints $a$, $b$, of the interval. 

b) If $T$ is to be self-adjoint then the Green function must be Hermitian, i.e. 
$R_\lambda(x,\xi) = [R_\lambda(\xi,x)]^*$. Find the condition on $K_\lambda$ for this to be true, and 
show that it implies that 

$$\frac{R_\lambda(b,\xi)}{R_\lambda(a,\xi)} = e^{i\theta_\lambda},$$

where $\theta_\lambda$ is some real angle. Deduce that the range of $R_\lambda$ is the set of 
functions 

$$\mathcal{D}_\lambda = \{y(x) : y(b) = e^{i\theta_\lambda}y(a)\}.$$

Now the range of $R_\lambda$ is the domain of $(T - \lambda I)$, which should be the same 
as the domain of $T$ and therefore not depend on $\lambda$. We therefore require 
that $\theta_\lambda$ not depend on $\lambda$. Deduce that $T$ will be self-adjoint only for 
boundary conditions $y(b) = e^{i\theta}y(a)$, i.e. for twisted periodic boundary 
conditions. 

c) Show that with the twisted periodic boundary conditions of part b), we 
have 

$$K_\lambda = -\cot\left(\frac{\lambda(b-a) - \theta}{2}\right).$$

From this, show that $R_\lambda(x,\xi)$ has simple poles at $\lambda = \lambda_n$, where $\lambda_n$ are 
the eigenvalues of $T$. 

d) Compute the residue of the pole of $R_\lambda(x,\xi)$ at the eigenvalue $\lambda_n$, and 
confirm that it is a product of the corresponding normalized eigenfunctions. 



Problem 5.9: Consider the one-dimensional Dirac Hamiltonian 

$$\hat{H} = \begin{pmatrix} -i\partial_x & m_1 - im_2\\ m_1 + im_2 & +i\partial_x \end{pmatrix} = -i\sigma_3\partial_x + m_1(x)\sigma_1 + m_2(x)\sigma_2.$$

Here $m_1(x)$, $m_2(x)$ are real functions, and the $\sigma_i$ are the Pauli matrices. $\hat{H}$ 
acts on a two-component "spinor" 

$$\Psi(x) = \begin{pmatrix} \psi_1(x)\\ \psi_2(x) \end{pmatrix}.$$

Impose self-adjoint boundary conditions 

at the ends of the interval $[a,b]$. Let $\Psi_L(x)$ be a solution of $\hat{H}\Psi = \lambda\Psi$ 
obeying the boundary condition at $x = a$, and $\Psi_R(x)$ be a solution obeying the 
boundary condition at $x = b$. Define the "Wronskian" of these solutions to be 

a) Show that, for real $\lambda$ and the given boundary conditions, the Wronskian 
$W(\Psi_L,\Psi_R)$ is independent of position. Show also that $W(\Psi_L,\Psi_L) = 
W(\Psi_R,\Psi_R) = 0$. 

b) Show that the matrix-valued Green function $G(x,\xi)$ obeying 

$$(\hat{H} - \lambda I)\,G(x,\xi) = I\,\delta(x-\xi),$$

and the given boundary conditions has entries 



G a p(x,0 = < 



w* 



Observe that $G_{\alpha\beta}(x,\xi) = G^*_{\beta\alpha}(\xi,x)$, as befits the inverse of a self-adjoint 
operator. 






c) The Green function is discontinuous at $x = \xi$, but we can define a 
"position-diagonal" part by taking the average 

Show that if we define the matrix $g(x)$ by setting $g(x) = G(x)\sigma_3$, then 
$\mathrm{tr}\,g(x) = 0$ and $g^2(x) = -\frac{1}{4}I$. Show further that 

$$i\partial_x g = [g, K], \tag{$\star$}$$

where $K(x) = \sigma_3\left(\lambda I - m_1(x)\sigma_1 - m_2(x)\sigma_2\right)$. 

The equation ($\star$) obtained in part (c) is the analogue of the Gelfand-Dikii 
equation for the Dirac Hamiltonian. It has applications in the theory of 
superconductivity, where ($\star$) is known as the Eilenberger equation. 



Chapter 6 

Partial Differential Equations 



Most differential equations of physics involve quantities depending on both 
space and time. Inevitably they involve partial derivatives, and so are partial 
differential equations (PDE's). Although PDE's are inherently more 
complicated than ODE's, many of the ideas from the previous chapters (in 
particular the notion of self adjointness and the resulting completeness of the 
eigenfunctions) carry over to the partial differential operators that occur 
in these equations. 



6.1 Classification of PDE's 

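We focus on second-order equations in two variables, such as the wave 
equation 

$$\frac{\partial^2\varphi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = f(x,t), \qquad \text{(Hyperbolic)} \tag{6.1}$$

Laplace or Poisson's equation 

$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} = f(x,y), \qquad \text{(Elliptic)} \tag{6.2}$$

or Fourier's heat equation 

$$\frac{\partial\varphi}{\partial t} - \kappa\frac{\partial^2\varphi}{\partial x^2} = f(x,t). \qquad \text{(Parabolic)} \tag{6.3}$$

What do the names hyperbolic, elliptic and parabolic mean? In high-school 
co-ordinate geometry we learned that a real quadratic curve 

$$ax^2 + 2bxy + cy^2 + fx + gy + h = 0 \tag{6.4}$$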
represents a hyperbola, an ellipse or a parabola depending on whether the 
discriminant, $ac - b^2$, is less than zero, greater than zero, or equal to zero, 
these being the conditions for the matrix 

$$\begin{pmatrix} a & b\\ b & c \end{pmatrix} \tag{6.5}$$

to have signature $(+,-)$, $(+,+)$ or $(+,0)$. 
By analogy, the equation 

$$a(x,y)\frac{\partial^2\varphi}{\partial x^2} + 2b(x,y)\frac{\partial^2\varphi}{\partial x\,\partial y} + c(x,y)\frac{\partial^2\varphi}{\partial y^2} + (\text{lower orders}) = 0, \tag{6.6}$$

is said to be hyperbolic, elliptic, or parabolic at a point $(x,y)$ if 

$$(ac - b^2)\big|_{(x,y)} = \begin{vmatrix} a(x,y) & b(x,y)\\ b(x,y) & c(x,y) \end{vmatrix}, \tag{6.7}$$

is less than, greater than, or equal to zero, respectively. This classification 
helps us understand what sort of initial or boundary data we need to specify 
the problem. 
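As a small illustration of the rule (6.7), here is a sketch of a classifier acting on the three leading coefficients at a point. The function name and the tolerance parameter `tol` are our own assumptions; a tolerance is needed only because floating-point coefficients rarely give an exactly vanishing discriminant.

```python
def classify(a: float, b: float, c: float, tol: float = 0.0) -> str:
    """Classify a*phi_xx + 2*b*phi_xy + c*phi_yy + (lower orders) = 0 at a point
    by the sign of the discriminant ac - b^2."""
    disc = a * c - b * b
    if disc < -tol:
        return "hyperbolic"
    if disc > tol:
        return "elliptic"
    return "parabolic"

speed, kappa = 3.0, 0.5
print(classify(1.0, 0.0, -1.0 / speed**2))  # wave equation (6.1), variables (x, t)
print(classify(1.0, 0.0, 1.0))              # Laplace equation (6.2)
print(classify(-kappa, 0.0, 0.0))           # heat equation (6.3): no second t-derivative
```

For variable coefficients the function is simply evaluated point by point, so that an equation may change type from one region to another.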

There are three broad classes of boundary conditions: 

a) Dirichlet boundary conditions: The value of the dependent vari- 
able is specified on the boundary. 

b) Neumann boundary conditions: The normal derivative of the de- 
pendent variable is specified on the boundary. 

c) Cauchy boundary conditions: Both the value and the normal deriva- 
tive of the dependent variable are specified on the boundary. 

Less commonly met are Robin boundary conditions, where the value of a 
linear combination of the dependent variable and the normal derivative of 
the dependent variable is specified on the boundary. 

Cauchy boundary conditions are analogous to the initial conditions for a 
second-order ordinary differential equation. These are given at one end of 
the interval only. The other two classes of boundary condition are higher- 
dimensional analogues of the conditions we impose on an ODE at both ends 
of the interval. 

Each class of PDE's requires a different class of boundary conditions in 
order to have a unique, stable solution. 

1) Elliptic equations require either Dirichlet or Neumann boundary 
conditions on a closed boundary surrounding the region of interest. Other 
boundary conditions are either insufficient to determine a unique 
solution, overly restrictive, or lead to instabilities. 

2) Hyperbolic equations require Cauchy boundary conditions on an open 
surface. Other boundary conditions are either too restrictive for a 
solution to exist, or insufficient to determine a unique solution. 

3) Parabolic equations require Dirichlet or Neumann boundary 
conditions on an open surface. Other boundary conditions are too restrictive. 



6.2 Cauchy Data 

Given a second-order ordinary differential equation 

$$p_0y'' + p_1y' + p_2y = f \tag{6.8}$$

with initial data $y(a)$, $y'(a)$ we can construct the solution incrementally. We 
take a step $\delta x = \varepsilon$ and use the initial slope to find $y(a+\varepsilon) = y(a) + \varepsilon y'(a)$. 
Next we find $y''(a)$ from the differential equation 

$$y''(a) = -\frac{1}{p_0}\left(p_1y'(a) + p_2y(a) - f(a)\right), \tag{6.9}$$

and use it to obtain $y'(a+\varepsilon) = y'(a) + \varepsilon y''(a)$. We now have initial data, 
$y(a+\varepsilon)$, $y'(a+\varepsilon)$, at the point $a + \varepsilon$, and can play the same game to proceed 
to $a + 2\varepsilon$, and onwards. 
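The stepping procedure just described is the explicit Euler method, and it can be written out in a few lines. The sketch below is illustrative only; the function name `march` and its argument conventions (coefficient functions passed as callables) are our own choices.

```python
import math

def march(p0, p1, p2, f, a, y_a, dy_a, eps, n_steps):
    """Propagate Cauchy data (y, y') for p0*y'' + p1*y' + p2*y = f
    from x = a in n_steps steps of size eps."""
    x, y, dy = a, y_a, dy_a
    for _ in range(n_steps):
        # Second derivative from the differential equation, as in (6.9).
        ddy = -(p1(x) * dy + p2(x) * y - f(x)) / p0(x)
        # Advance y with the current slope and y' with the current curvature.
        y, dy = y + eps * dy, dy + eps * ddy
        x += eps
    return x, y, dy

# Test case: y'' + y = 0 with y(0) = 0, y'(0) = 1, whose solution is sin(x).
one = lambda x: 1.0
zero = lambda x: 0.0
x_end, y_end, dy_end = march(one, zero, one, zero, 0.0, 0.0, 1.0, 1e-4, 10_000)
print(x_end, y_end, math.sin(1.0))
```

The division by `p0(x)` makes explicit why a vanishing leading coefficient stops the march, a point that becomes central for partial differential equations.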

Figure 6.1: The surface $\Gamma$ on which we are given Cauchy Data. 

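Suppose now that we have the analogous situation of a second order 
partial differential equation 

$$a^{\mu\nu}(x)\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu} + (\text{lower orders}) = 0 \tag{6.10}$$

in $\mathbb{R}^n$. We are also given initial data on a surface, $\Gamma$, of co-dimension one in 
$\mathbb{R}^n$. 

At each point $p$ on $\Gamma$ we erect a basis $\mathbf{n}, \mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_{n-1}$, consisting of the 
normal to $\Gamma$ and $n - 1$ tangent vectors. The information we have been given 
consists of the value of $\varphi$ at every point $p$ together with 

$$\frac{\partial\varphi}{\partial n} \stackrel{\rm def}{=} n^\mu\frac{\partial\varphi}{\partial x^\mu}, \tag{6.11}$$

the normal derivative of $\varphi$ at $p$. We want to know if this Cauchy data 
is sufficient to find the second derivative in the normal direction, and so 
construct similar Cauchy data on the adjacent surface $\Gamma + \varepsilon\mathbf{n}$. If so, we can 
repeat the process and systematically propagate the solution forward through 
$\mathbb{R}^n$. 

From the given data, we can construct 

$$\frac{\partial^2\varphi}{\partial n\,\partial t_i} \stackrel{\rm def}{=} n^\mu t_i^\nu\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu},$$

$$\frac{\partial^2\varphi}{\partial t_i\,\partial t_j} \stackrel{\rm def}{=} t_i^\mu t_j^\nu\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu}, \tag{6.12}$$

but we do not yet have enough information to determine 

$$\frac{\partial^2\varphi}{\partial n\,\partial n} \stackrel{\rm def}{=} n^\mu n^\nu\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu}. \tag{6.13}$$

Can we fill the data gap by using the differential equation (6.10)? Suppose 
that 

$$\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu} = \varphi^0_{\mu\nu} + n_\mu n_\nu\Phi \tag{6.14}$$

where $\varphi^0_{\mu\nu}$ is a guess that is consistent with (6.12), and $\Phi$ is as yet unknown, 
and, because of the factor of $n_\mu n_\nu$, does not affect the derivatives (6.12). We 
plug into 

$$a^{\mu\nu}(x_i)\frac{\partial^2\varphi}{\partial x^\mu\,\partial x^\nu} + (\text{known lower orders}) = 0. \tag{6.15}$$

and get 

$$a^{\mu\nu}n_\mu n_\nu\Phi + (\text{known}) = 0. \tag{6.16}$$

We can therefore find $\Phi$ provided that 

$$a^{\mu\nu}n_\mu n_\nu \ne 0. \tag{6.17}$$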
If this expression is zero, we are stuck. It is like having $p_0(x) = 0$ in an 
ordinary differential equation. On the other hand, knowing $\Phi$ tells us the 
second normal derivative, and we can proceed to the adjacent surface where 
we play the same game once more. 

Definition: A characteristic surface is a surface $\Sigma$ such that $a^{\mu\nu}n_\mu n_\nu = 0$ 
at all points on $\Sigma$. We can therefore propagate our data forward, provided 
that the initial-data surface $\Gamma$ is nowhere tangent to a characteristic surface. 
In two dimensions the characteristic surfaces become one-dimensional curves. 
An equation in two dimensions is hyperbolic, parabolic, or elliptic at a 
point $(x,y)$ if it has two, one or zero characteristic curves through that point, 
respectively. 

Characteristics are both a curse and blessing. They are a barrier to 
Cauchy data, but, as we see in the next two subsections, they are also the 
curves along which information is transmitted. 

6.2.1 Characteristics and first-order equations 

Suppose we have a linear first-order partial differential equation 

$$a(x,y)\frac{\partial u}{\partial x} + b(x,y)\frac{\partial u}{\partial y} + c(x,y)u = f(x,y). \tag{6.18}$$

We can write this in vector notation as $(\mathbf{v}\cdot\nabla)u + cu = f$, where $\mathbf{v}$ is the 
vector field $\mathbf{v} = (a,b)$. If we define the flow of the vector field to be the 
family of parametrized curves $x(t)$, $y(t)$ satisfying 

$$\frac{dx}{dt} = a(x,y), \qquad \frac{dy}{dt} = b(x,y), \tag{6.19}$$

then the partial differential equation (6.18) reduces to an ordinary linear 
differential equation 

$$\frac{du}{dt} + c(t)u(t) = f(t) \tag{6.20}$$

along each flow line. Here, 

$$u(t) \equiv u(x(t),y(t)), \qquad c(t) \equiv c(x(t),y(t)), \qquad f(t) \equiv f(x(t),y(t)). \tag{6.21}$$

Figure 6.2: Initial data curve $\Gamma$, and flow-line characteristics. 

Provided that $a(x,y)$ and $b(x,y)$ are never simultaneously zero, there will be 
one flow-line curve passing through each point in $\mathbb{R}^2$. If we have been given 
the initial value of $u$ on a curve $\Gamma$ that is nowhere tangent to any of these flow 
lines then we can propagate this data forward along the flow by solving (6.20). 
On the other hand, if the curve $\Gamma$ does become tangent to one of the flow 
lines at some point then the data will generally be inconsistent with (6.18) 
at that point, and no solution can exist. The flow lines therefore play a role 
analogous to the characteristics of a second-order partial differential equation, 
and are therefore also called characteristics. The trick of reducing the partial 
differential equation to a collection of ordinary differential equations along 
each of its flow lines is called the method of characteristics. 
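To see the method of characteristics at work numerically, the sketch below integrates the flow equations (6.19) together with (6.20) along a single flow line, using a standard fourth-order Runge-Kutta step. As a test case we take $\partial_x u - \partial_y u - (x-y)u = 0$ (so $a = 1$, $b = -1$, $c = -(x-y)$, $f = 0$) with data $u(x,0) = \cos x$ on the line $y = 0$, for which $u = e^{-xy}\cos(x+y)$ is an exact solution. The function name and the choice of integrator are our own assumptions.

```python
import math

def solve_along_characteristic(a, b, c, f, x0, y0, u0, t_end, n_steps=1000):
    """Integrate dx/dt = a, dy/dt = b, du/dt = f - c*u from (x0, y0, u0)."""
    def rhs(x, y, u):
        return a(x, y), b(x, y), f(x, y) - c(x, y) * u

    h = t_end / n_steps
    x, y, u = x0, y0, u0
    for _ in range(n_steps):
        k1 = rhs(x, y, u)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], u + 0.5 * h * k1[2])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], u + 0.5 * h * k2[2])
        k4 = rhs(x + h * k3[0], y + h * k3[1], u + h * k3[2])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        u += h * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6
    return x, y, u

a = lambda x, y: 1.0
b = lambda x, y: -1.0
c = lambda x, y: -(x - y)
f = lambda x, y: 0.0

x0 = 0.3                                   # starting point (x0, 0) on the data curve
x1, y1, u1 = solve_along_characteristic(a, b, c, f, x0, 0.0, math.cos(x0), 0.5)
exact = math.exp(-x1 * y1) * math.cos(x1 + y1)
print(x1, y1, u1, exact)
```

Repeating the integration for a range of starting points `x0` on the data curve fills in the solution over the region swept out by the flow lines.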

Exercise 6.1: Show that the general solution to the equation 

$$\frac{\partial\varphi}{\partial x} - \frac{\partial\varphi}{\partial y} - (x-y)\varphi = 0$$

is 

$$\varphi(x,y) = e^{-xy}f(x+y),$$

where $f$ is an arbitrary function. 



6.2.2 Second-order hyperbolic equations 

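Consider a second-order equation containing the operator 

$$D = a(x,y)\frac{\partial^2}{\partial x^2} + 2b(x,y)\frac{\partial^2}{\partial x\,\partial y} + c(x,y)\frac{\partial^2}{\partial y^2}. \tag{6.22}$$

We can always factorize 

$$aX^2 + 2bXY + cY^2 = (\alpha X + \beta Y)(\gamma X + \delta Y), \tag{6.23}$$

and from this obtain 

$$a\frac{\partial^2}{\partial x^2} + 2b\frac{\partial^2}{\partial x\,\partial y} + c\frac{\partial^2}{\partial y^2} = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)\left(\gamma\frac{\partial}{\partial x} + \delta\frac{\partial}{\partial y}\right) + \text{lower},$$

$$= \left(\gamma\frac{\partial}{\partial x} + \delta\frac{\partial}{\partial y}\right)\left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right) + \text{lower}. \tag{6.24}$$

Here "lower" refers to terms containing only first order derivatives such as 

$$\alpha\left(\frac{\partial\gamma}{\partial x}\right)\frac{\partial}{\partial x}, \qquad \beta\left(\frac{\partial\delta}{\partial y}\right)\frac{\partial}{\partial y}.$$

A necessary condition, however, for the coefficients $\alpha, \beta, \gamma, \delta$ to be real is that 

$$ac - b^2 = \alpha\beta\gamma\delta - \frac{1}{4}(\alpha\delta + \beta\gamma)^2 = -\frac{1}{4}(\alpha\delta - \beta\gamma)^2 \le 0. \tag{6.25}$$

A factorization of the leading terms in the second-order operator $D$ as the 
product of two real first-order differential operators therefore requires that 
$D$ be hyperbolic or parabolic. It is easy to see that this is also a sufficient 
condition for such a real factorization. For the rest of this section we assume 
that the equation is hyperbolic, and so 

$$ac - b^2 = -\frac{1}{4}(\alpha\delta - \beta\gamma)^2 < 0. \tag{6.26}$$

With this condition, the two families of flow curves defined by 

$$C_1: \quad \frac{dx}{dt} = \alpha(x,y), \qquad \frac{dy}{dt} = \beta(x,y), \tag{6.27}$$

and 

$$C_2: \quad \frac{dx}{dt} = \gamma(x,y), \qquad \frac{dy}{dt} = \delta(x,y), \tag{6.28}$$

are distinct, and are the characteristics of $D$. 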
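The factorization (6.23) is not unique, but one explicit real choice is easy to write down when $a \ne 0$: take $\alpha = a$, $\beta = b + s$, $\gamma = 1$, $\delta = (b - s)/a$ with $s = \sqrt{b^2 - ac}$. The sketch below implements this particular choice (our own, made purely for illustration) at a single point and checks the identity (6.25).

```python
import math

def real_factors(a, b, c):
    """Return (alpha, beta, gamma, delta) with
    a X^2 + 2 b X Y + c Y^2 = (alpha X + beta Y)(gamma X + delta Y).

    Assumes a != 0 and ac - b^2 <= 0 (hyperbolic or parabolic case).
    """
    disc = b * b - a * c
    if a == 0 or disc < 0:
        raise ValueError("need a != 0 and ac - b^2 <= 0 for this real factorization")
    s = math.sqrt(disc)
    return a, b + s, 1.0, (b - s) / a

a, b, c = 2.0, 3.0, 1.0
alpha, beta, gamma, delta = real_factors(a, b, c)

# Coefficients of the expanded product, and the two sides of identity (6.25).
product = (alpha * gamma, alpha * delta + beta * gamma, beta * delta)
lhs = a * c - b * b
rhs = -0.25 * (alpha * delta - beta * gamma) ** 2
print(product, lhs, rhs)
```

The two characteristic directions $(\alpha,\beta)$ and $(\gamma,\delta)$ coincide exactly when $s = 0$, which is the parabolic case excluded by (6.26).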
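A hyperbolic second-order differential equation $Du = 0$ can therefore be 
written in either of two ways: 

$$\left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y}\right)U_1 + F_1 = 0, \tag{6.29}$$

or 

$$\left(\gamma\frac{\partial}{\partial x} + \delta\frac{\partial}{\partial y}\right)U_2 + F_2 = 0, \tag{6.30}$$

where 

$$U_1 = \gamma\frac{\partial u}{\partial x} + \delta\frac{\partial u}{\partial y},$$

$$U_2 = \alpha\frac{\partial u}{\partial x} + \beta\frac{\partial u}{\partial y}, \tag{6.31}$$

and $F_{1,2}$ contain only $\partial u/\partial x$ and $\partial u/\partial y$. Given suitable Cauchy data, we 
can solve the two first-order partial differential equations by the method 
of characteristics described in the previous subsection, and so find $U_1(x,y)$ 
and $U_2(x,y)$. Because the hyperbolicity condition (6.26) guarantees that the 
determinant 

$$\begin{vmatrix} \gamma & \delta\\ \alpha & \beta \end{vmatrix} = \gamma\beta - \alpha\delta$$

is not zero, we can solve (6.31) and so extract from $U_{1,2}$ the individual 
derivatives $\partial u/\partial x$ and $\partial u/\partial y$. From these derivatives and the initial values of $u$, 
we can determine $u(x,y)$. 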

6.3 Wave Equation 

The wave equation provides the paradigm for hyperbolic equations that can 
be solved by the method of characteristics. 

6.3.1 d'Alembert's Solution 

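Let $\varphi(x,t)$ obey the wave equation 

$$\frac{\partial^2\varphi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0. \tag{6.32}$$

We use the method of characteristics to propagate Cauchy data $\varphi(x,0) = 
\varphi_0(x)$ and $\dot\varphi(x,0) = v_0(x)$, given on the curve $\Gamma = \{x \in \mathbb{R}, t = 0\}$, forward 
in time. 

We begin by factoring the wave equation as 

$$0 = \left(\frac{\partial^2\varphi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2}\right) = \left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial\varphi}{\partial x} - \frac{1}{c}\frac{\partial\varphi}{\partial t}\right). \tag{6.33}$$

Thus, 

$$\left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)(U - V) = 0, \tag{6.34}$$

where 

$$U = \varphi' = \frac{\partial\varphi}{\partial x}, \qquad V = \frac{1}{c}\dot\varphi = \frac{1}{c}\frac{\partial\varphi}{\partial t}. \tag{6.35}$$

The quantity $U - V$ is therefore constant along the characteristic curves 

$$x - ct = \text{const.} \tag{6.36}$$

Writing the linear factors in the reverse order yields the equation 

$$\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right)(U + V) = 0. \tag{6.37}$$

This implies that $U + V$ is constant along the characteristics 

$$x + ct = \text{const.} \tag{6.38}$$

Putting these two facts together tells us that 

$$V(x,t') = \frac{1}{2}[V(x,t') + U(x,t')] + \frac{1}{2}[V(x,t') - U(x,t')]$$

$$= \frac{1}{2}[V(x+ct',0) + U(x+ct',0)] + \frac{1}{2}[V(x-ct',0) - U(x-ct',0)]. \tag{6.39}$$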

The value of the variable $V$ at the point $(x,t')$ has therefore been computed 
in terms of the values of $U$ and $V$ on the initial curve $\Gamma$. After changing 
variables from $t'$ to $\xi = x \pm ct'$ as appropriate, we can integrate up to find 
that 

$$\varphi(x,t) = \varphi(x,0) + c\int_0^t V(x,t')\,dt'$$

$$= \varphi(x,0) + \frac{1}{2}\int_x^{x+ct}\varphi'(\xi,0)\,d\xi + \frac{1}{2}\int_x^{x-ct}\varphi'(\xi,0)\,d\xi + \frac{1}{2c}\int_{x-ct}^{x+ct}\dot\varphi(\xi,0)\,d\xi$$

$$= \frac{1}{2}\left\{\varphi(x+ct,0) + \varphi(x-ct,0)\right\} + \frac{1}{2c}\int_{x-ct}^{x+ct}\dot\varphi(\xi,0)\,d\xi. \tag{6.40}$$

This result 

$$\varphi(x,t) = \frac{1}{2}\left\{\varphi_0(x+ct) + \varphi_0(x-ct)\right\} + \frac{1}{2c}\int_{x-ct}^{x+ct} v_0(\xi)\,d\xi \tag{6.41}$$

is usually known as d'Alembert's solution of the wave equation. It was 
actually obtained first by Euler in 1748. 
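Formula (6.41) is simple to evaluate for given initial data. The following sketch does so, approximating the velocity integral by Simpson's rule; the function name `dalembert` and the number of quadrature panels are our own choices. As a check we use $\varphi_0 = \sin x$, $v_0 = \cos x$, for which (6.41) gives $\varphi(x,t) = \sin x\cos ct + c^{-1}\cos x\sin ct$.

```python
import math

def dalembert(phi0, v0, c, x, t, n=2000):
    """Evaluate d'Alembert's formula (6.41) at the point (x, t).

    The integral of v0 over [x - c t, x + c t] uses Simpson's rule with
    n (even) panels.
    """
    lo, hi = x - c * t, x + c * t
    h = (hi - lo) / n
    s = v0(lo) + v0(hi)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * v0(lo + j * h)
    integral = s * h / 3
    return 0.5 * (phi0(hi) + phi0(lo)) + integral / (2 * c)

c, x, t = 2.0, 0.4, 0.3
value = dalembert(math.sin, math.cos, c, x, t)
exact = math.sin(x) * math.cos(c * t) + math.cos(x) * math.sin(c * t) / c
print(value, exact)
```

Only the data on the interval $[x - ct, x + ct]$ enter the computation, which is the finite domain of dependence discussed next in the text.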




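Figure 6.3: Range of Cauchy data influencing $\varphi(x,t)$. (The characteristics 
$x - ct = \text{const.}$ and $x + ct = \text{const.}$ through the point meet the initial line at 
$x - ct$ and $x + ct$.) 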



The value of $\varphi$ at $x$, $t$, is determined by only a finite interval of the initial 
Cauchy data. In more generality, $\varphi(x,t)$ depends only on what happens in 
the past light-cone of the point, which is bounded by a pair of characteristic 
curves. This is illustrated in figure 6.3. 

D'Alembert and Euler squabbled over whether $\varphi_0$ and $v_0$ had to be twice 
differentiable for the solution (6.41) to make sense. Euler wished to apply 
(6.41) to a plucked string, which has a discontinuous slope at the plucked 
point, but d'Alembert argued that the wave equation, with its second 
derivative, could not be applied in this case. This was a dispute that could not be 
resolved (in Euler's favour) until the advent of the theory of distributions. It 
highlights an important difference between ordinary and partial differential 
equations: an ODE with smooth coefficients has smooth solutions; a PDE 
with smooth coefficients can admit discontinuous or even distributional 
solutions. 

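An alternative route to d'Alembert's solution uses a method that applies 
most effectively to PDE's with constant coefficients. We first seek a general 
solution to the PDE involving two arbitrary functions. Begin with a change 
of variables. Let 

$$\xi = x + ct,$$

$$\eta = x - ct. \tag{6.42}$$

be light-cone co-ordinates. In terms of them, we have 

$$x = \frac{1}{2}(\xi + \eta),$$

$$t = \frac{1}{2c}(\xi - \eta). \tag{6.43}$$

Now, 

$$\frac{\partial}{\partial\xi} = \frac{\partial x}{\partial\xi}\frac{\partial}{\partial x} + \frac{\partial t}{\partial\xi}\frac{\partial}{\partial t} = \frac{1}{2}\left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right). \tag{6.44}$$

Similarly 

$$\frac{\partial}{\partial\eta} = \frac{1}{2}\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right). \tag{6.45}$$

Thus 

$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) = \left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right) = 4\frac{\partial^2}{\partial\xi\,\partial\eta}. \tag{6.46}$$

The characteristics of the equation 

$$4\frac{\partial^2\varphi}{\partial\xi\,\partial\eta} = 0 \tag{6.47}$$

are $\xi = \text{const.}$ or $\eta = \text{const.}$ There are two characteristic curves through 
each point, so the equation is still hyperbolic. 

With light-cone coordinates it is easy to see that a solution to 

$$\left(\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\varphi = 4\frac{\partial^2\varphi}{\partial\xi\,\partial\eta} = 0 \tag{6.48}$$

is 

$$\varphi(x,t) = f(\xi) + g(\eta) = f(x+ct) + g(x-ct). \tag{6.49}$$

It is this expression that was obtained by d'Alembert (1746). 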

Following Euler, we use d'Alembert's general solution to propagate the
Cauchy data $\varphi(x,0) = \varphi_0(x)$ and $\dot\varphi(x,0) = v_0(x)$ by using this information
to determine the functions $f$ and $g$. We have

$$ f(x) + g(x) = \varphi_0(x), \qquad c\left(f'(x) - g'(x)\right) = v_0(x). \tag{6.50} $$

Integration of the second line with respect to $x$ gives

$$ f(x) - g(x) = \frac{1}{c}\int_0^x v_0(\xi)\,d\xi + A, \tag{6.51} $$

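where $A$ is an unknown (but irrelevant) constant. We can now solve for $f$
and $g$, and find

$$ f(x) = \frac{1}{2}\varphi_0(x) + \frac{1}{2c}\int_0^x v_0(\xi)\,d\xi + \frac{1}{2}A, \qquad g(x) = \frac{1}{2}\varphi_0(x) - \frac{1}{2c}\int_0^x v_0(\xi)\,d\xi - \frac{1}{2}A, \tag{6.52} $$

and so

$$ \varphi(x,t) = \frac{1}{2}\left\{\varphi_0(x + ct) + \varphi_0(x - ct)\right\} + \frac{1}{2c}\int_{x-ct}^{x+ct} v_0(\xi)\,d\xi. \tag{6.53} $$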

The unknown constant A has disappeared in the end result, and again we 
find "d'Alembert's" solution. 
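As an illustration, formula (6.53) can be evaluated numerically. The following is a minimal sketch, not part of the original notes: the function name `dalembert`, the trapezoidal quadrature, and the particular Cauchy data are all assumptions chosen so that the exact answer, sin(x + ct), is known.

```python
import numpy as np

def dalembert(phi0, v0, x, t, c=1.0, n_quad=2001):
    """Evaluate d'Alembert's formula (6.53) at a single point (x, t)."""
    # Average of the left- and right-moving copies of the initial displacement.
    wave_part = 0.5 * (phi0(x + c * t) + phi0(x - c * t))
    # Trapezoidal estimate of the integral of v0 over [x - ct, x + ct].
    xi = np.linspace(x - c * t, x + c * t, n_quad)
    vals = v0(xi)
    h = xi[1] - xi[0]
    integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return wave_part + integral / (2.0 * c)

# Hypothetical Cauchy data, chosen so that phi(x, t) = sin(x + c t) exactly.
c = 2.0
phi0 = lambda x: np.sin(x)
v0 = lambda x: c * np.cos(x)

value = dalembert(phi0, v0, x=0.3, t=0.7, c=c)
exact = np.sin(0.3 + c * 0.7)
print(value, exact)
```

With this data the initial velocity term cancels the right-moving half of the displacement, so only the left-moving wave survives.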

Exercise 6.2: Show that when the operator $D$ in a constant-coefficient second-order
PDE $D\varphi = 0$ is reducible, meaning that it can be factored into two
distinct first-order factors $D = P_1 P_2$, where

$$ P_i = \alpha_i\frac{\partial}{\partial x} + \beta_i\frac{\partial}{\partial y} + \gamma_i, $$

then the general solution to $D\varphi = 0$ can be written as $\varphi = \phi_1 + \phi_2$, where
$P_1\phi_1 = 0$, $P_2\phi_2 = 0$. Hence, or otherwise, show that the general solution to
the equation

$$ \frac{\partial^2 \varphi}{\partial x\,\partial y} + 2\frac{\partial^2 \varphi}{\partial y^2} - \frac{\partial \varphi}{\partial x} - 2\frac{\partial \varphi}{\partial y} = 0 $$

is

$$ \varphi(x,y) = f(2x - y) + e^{y} g(x), $$

where $f$, $g$ are arbitrary functions.

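Exercise 6.3: Show that when the constant-coefficient operator $D$ is of the
form

$$ D = P^2 = \left(\alpha\frac{\partial}{\partial x} + \beta\frac{\partial}{\partial y} + \gamma\right)^2, $$

with $\alpha \neq 0$, then the general solution to $D\varphi = 0$ is given by $\varphi = \phi_1 + x\phi_2$,
where $P\phi_{1,2} = 0$. (If $\alpha = 0$ and $\beta \neq 0$, then $\varphi = \phi_1 + y\phi_2$.)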

6.3.2 Fourier's Solution 

In 1755 Daniel Bernoulli proposed solving for the motion of a finite length $L$
of transversely vibrating string by setting

$$ y(x,t) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi c t}{L}\right), \tag{6.54} $$

but he did not know how to find the coefficients $A_n$ (and perhaps did not
care that his cosine time dependence restricted his solution to the initial
condition $\dot y(x,0) = 0$). Bernoulli's idea was dismissed out of hand by Euler
and d'Alembert as being too restrictive. They simply refused to believe that
(almost) any chosen function could be represented by a trigonometric series
expansion. It was only fifty years later, in a series of papers starting in
1807, that Joseph Fourier showed how to compute the $A_n$ and insisted that
indeed "any" function could be expanded in this way. Mathematicians have
expended much effort in investigating the extent to which Fourier's claim is
true.

We now try our hand at Bernoulli's game. Because we are solving the
wave equation on the infinite line, we seek a solution as a Fourier integral.
A sufficiently general form is

$$ \varphi(x,t) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\left\{ a(k)\,e^{ikx - i\omega_k t} + a^*(k)\,e^{-ikx + i\omega_k t} \right\}, \tag{6.55} $$

where $\omega_k \equiv c|k|$ is the positive root of $\omega^2 = c^2 k^2$. The terms being summed
by the integral are each individually of the form $f(x - ct)$ or $f(x + ct)$, and so
$\varphi(x,t)$ is indeed a solution of the wave equation. The positive-root convention
means that positive $k$ corresponds to right-going waves, and negative $k$ to
left-going waves.

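We find the amplitudes $a(k)$ by fitting to the Fourier transforms

$$ \Phi(k) \equiv \int_{-\infty}^{\infty} \varphi(x, t=0)\,e^{-ikx}\,dx, \qquad \chi(k) \equiv \int_{-\infty}^{\infty} \dot\varphi(x, t=0)\,e^{-ikx}\,dx, \tag{6.56} $$

of the Cauchy data. Comparing

$$ \varphi(x, t=0) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\,\Phi(k)\,e^{ikx}, \qquad \dot\varphi(x, t=0) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\,\chi(k)\,e^{ikx}, \tag{6.57} $$

with (6.55) shows that

$$ \Phi(k) = a(k) + a^*(-k), \qquad \chi(k) = i\omega_k\left(a^*(-k) - a(k)\right). \tag{6.58} $$

Solving, we find

$$ a(k) = \frac{1}{2}\left(\Phi(k) + \frac{i}{\omega_k}\chi(k)\right), \qquad a^*(k) = \frac{1}{2}\left(\Phi(-k) - \frac{i}{\omega_k}\chi(-k)\right). \tag{6.59} $$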

The accumulated wisdom of two hundred years of research on Fourier 
series and Fourier integrals shows that, when appropriately interpreted, this 
solution is equivalent to d'Alembert's. 
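The equivalence can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not the authors' method: it replaces the infinite line by a large periodic box so that the fast Fourier transform can stand in for the Fourier integral, and it uses the fact that (6.55) with (6.59) amounts to multiplying each mode by cos(ω_k t) and sin(ω_k t)/ω_k. The function name `fourier_wave` and the Gaussian initial data are hypothetical.

```python
import numpy as np

def fourier_wave(phi0_samples, v0_samples, box_length, t, c=1.0):
    """Propagate sampled Cauchy data mode by mode on a periodic box."""
    n = phi0_samples.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    omega = c * np.abs(k)                    # positive root of w^2 = c^2 k^2
    Phi = np.fft.fft(phi0_samples)           # transform of phi(x, 0)
    chi = np.fft.fft(v0_samples)             # transform of d(phi)/dt at t = 0
    # a(k) e^{-iwt} + a*(-k) e^{+iwt} = Phi cos(wt) + chi sin(wt)/w.
    # t * sinc(w t / pi) equals sin(w t)/w and stays finite at w = 0.
    phi_k = Phi * np.cos(omega * t) + chi * t * np.sinc(omega * t / np.pi)
    return np.fft.ifft(phi_k).real

# Hypothetical test data: a Gaussian bump released from rest.
box_length, n, c, t = 40.0, 1024, 1.0, 3.0
x = -box_length / 2 + box_length / n * np.arange(n)
phi0 = np.exp(-x**2)
v0 = np.zeros_like(x)

phi_fourier = fourier_wave(phi0, v0, box_length, t, c)
phi_dalembert = 0.5 * (np.exp(-(x + c * t)**2) + np.exp(-(x - c * t)**2))
max_error = np.max(np.abs(phi_fourier - phi_dalembert))
print(max_error)
```

The box is taken wide enough that the two half-height pulses never reach its edges, so the periodic images play no role at this time.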

6.3.3 Causal Green Function 

We now add a source term:

$$ \frac{1}{c^2}\frac{\partial^2 \varphi}{\partial t^2} - \frac{\partial^2 \varphi}{\partial x^2} = q(x,t). \tag{6.60} $$

We solve this equation by finding a Green function such that

$$ \left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2}\right) G(x,t;\xi,\tau) = \delta(x - \xi)\,\delta(t - \tau). \tag{6.61} $$

If the only waves in the system are those produced by the source, we should
demand that the Green function be causal, in that $G(x,t;\xi,\tau) = 0$ if $t < \tau$.



Figure 6.4: Support of $G(x,t;\xi,\tau)$ for fixed $\xi$, $\tau$, or the "domain of influence."



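To construct the causal Green function, we integrate the equation over
an infinitesimal time interval from $\tau - \epsilon$ to $\tau + \epsilon$ and so find Cauchy data

$$ G(x, \tau + \epsilon; \xi, \tau) = 0, \qquad \frac{\partial}{\partial t} G(x, \tau + \epsilon; \xi, \tau) = c^2\,\delta(x - \xi). \tag{6.62} $$

We insert this data into d'Alembert's solution to get

$$ G(x,t;\xi,\tau) = \theta(t - \tau)\,\frac{c}{2}\int_{x - c(t-\tau)}^{x + c(t-\tau)} \delta(\zeta - \xi)\,d\zeta = \frac{c}{2}\,\theta(t - \tau)\left\{\theta\big(x - \xi + c(t - \tau)\big) - \theta\big(x - \xi - c(t - \tau)\big)\right\}. \tag{6.63} $$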



We can now use the Green function to write the solution to the inhomogeneous
problem as

$$ \varphi(x,t) = \iint G(x,t;\xi,\tau)\,q(\xi,\tau)\,d\tau\,d\xi. \tag{6.64} $$

The step-function form of $G(x,t;\xi,\tau)$ allows us to obtain

$$ \varphi(x,t) = \frac{c}{2}\int_{-\infty}^{t} d\tau \int_{x - c(t-\tau)}^{x + c(t-\tau)} q(\xi,\tau)\,d\xi = \frac{c}{2}\iint_{\Omega} q(\xi,\tau)\,d\tau\,d\xi, \tag{6.65} $$

where the domain of integration $\Omega$ is shown in figure 6.5.

Figure 6.5: The region $\Omega$, or the "domain of dependence."
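The domain-of-dependence formula (6.65) is easy to test on a source whose response is known in closed form. The sketch below is an illustration only: the function names, the hand-written trapezoid rule, and the source q = cos(kξ) switched on at τ = 0 are assumptions. For that source, direct substitution into (6.60) shows that the response starting from rest is cos(kx)(1 - cos(ckt))/k².

```python
import numpy as np

def trapezoid(values, spacing):
    """Composite trapezoid rule on an equally spaced grid."""
    return spacing * (values.sum() - 0.5 * (values[0] + values[-1]))

def driven_wave(q, x, t, c=1.0, n_tau=400, n_xi=400):
    """(c/2) times the integral of q over the past light-cone of (x, t).

    Assumes the source vanishes for tau < 0, so tau runs from 0 to t.
    """
    taus = np.linspace(0.0, t, n_tau)
    inner = np.empty(n_tau)
    for i, tau in enumerate(taus):
        half_width = c * (t - tau)           # light-cone half-width at time tau
        xis = np.linspace(x - half_width, x + half_width, n_xi)
        inner[i] = trapezoid(q(xis, tau), xis[1] - xis[0])
    return 0.5 * c * trapezoid(inner, taus[1] - taus[0])

# Hypothetical source: cos(k xi), switched on at tau = 0.
c, k = 1.5, 2.0
q = lambda xi, tau: np.cos(k * xi)
x, t = 0.4, 1.1

numeric = driven_wave(q, x, t, c)
exact = np.cos(k * x) * (1.0 - np.cos(c * k * t)) / k**2
print(numeric, exact)
```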

We can write the causal Green function in the form of Fourier's solution
of the wave equation. We claim that

$$ G(x,t;\xi,\tau) = c^2\int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\frac{e^{ik(x-\xi)}\,e^{-i\omega(t-\tau)}}{c^2k^2 - (\omega + i\epsilon)^2}, \tag{6.66} $$

where the $i\epsilon$ plays the same role in enforcing causality as it does for the
harmonic oscillator in one dimension. This is only to be expected. If we
decompose a vibrating string into normal modes, then each mode is an
independent oscillator with $\omega_k^2 = c^2k^2$, and the Green function for the PDE is
simply the sum of the ODE Green functions for each $k$ mode. To confirm our
claim, we exploit our previous results for the single-oscillator Green function
to evaluate the integral over $\omega$, and we find

$$ G(x,t;0,0) = \theta(t)\,c^2\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ikx}\,\frac{1}{c|k|}\sin(|k|ct). \tag{6.67} $$

Despite the factor of $1/|k|$, there is no singularity at $k = 0$, so no $i\epsilon$ is
needed to make the integral over $k$ well defined. We can do the $k$ integral
by recognizing that the integrand is nothing but the Fourier representation,
$\frac{2}{k}\sin ak$, of a square-wave pulse. We end up with

$$ G(x,t;0,0) = \theta(t)\,\frac{c}{2}\left\{\theta(x + ct) - \theta(x - ct)\right\}, \tag{6.68} $$



the same expression as from our direct construction. We can also write

$$ G(x,t;0,0) = \frac{c}{2}\int_{-\infty}^{\infty}\frac{dk}{2\pi}\left(\frac{i}{|k|}\right)\left\{ e^{ikx - ic|k|t} - e^{-ikx + ic|k|t} \right\}, \qquad t > 0, \tag{6.69} $$

which is in explicit Fourier-solution form with $a(k) = ic/2|k|$.
Illustration: Radiation Damping. Figure 6.6 shows a bead of mass $M$ that
slides without friction on the $y$ axis. The bead is attached to an infinite
string which is initially undisturbed and lying along the $x$ axis. The string has
tension $T$, and a density $\rho$, so the speed of waves on the string is $c = \sqrt{T/\rho}$.
We show that either d'Alembert or Fourier can be used to compute the effect
of the string on the motion of the bead.

We first use d'Alembert's general solution to show that wave energy
emitted by the moving bead gives rise to an effective viscous damping force on
it.



Figure 6.6: A bead connected to a string.



The string tension acting on the bead leads to the equation of
motion $M\dot v = T y'(0,t)$, and from the condition of no incoming waves we
know that

$$ y(x,t) = y(x - ct). \tag{6.70} $$

Thus $y'(0,t) = -\dot y(0,t)/c$. But the bead is attached to the string, so $v(t) = \dot y(0,t)$,
and therefore

$$ M\dot v = -\left(\frac{T}{c}\right) v. \tag{6.71} $$

The emitted radiation therefore generates a velocity-dependent drag force
with friction coefficient $\eta = T/c$.

We need an infinitely long string for (6.71) to be true for all time. If 
the string had a finite length L, then, after a period of 2L/c, energy will be 
reflected back to the bead and this will complicate matters. 




Figure 6.7: The function $\varphi_0(x)$ and its derivative.

We now show that Fourier's mode-decomposition of the string motion,
combined with the Caldeira-Leggett analysis of chapter 5, yields the same
expression for the radiation damping as the d'Alembert solution. Our
bead-string contraption has Lagrangian

$$ L = \frac{M}{2}\left[\dot y(0,t)\right]^2 - V[y(0,t)] + \int_0^L \left\{ \frac{\rho}{2}\dot y^2 - \frac{T}{2}y'^2 \right\} dx. \tag{6.72} $$

Here, $V[y]$ is some potential energy for the bead.

To deal with the motion of the bead, we introduce a function $\varphi_0(x)$ such
that $\varphi_0(0) = 1$ and $\varphi_0(x)$ decreases rapidly to zero as $x$ increases (see figure
6.7). We therefore have $-\varphi_0'(x) \approx \delta(x)$. We expand $y(x,t)$ in terms of $\varphi_0(x)$
and the normal modes of a string with fixed ends as

$$ y(x,t) = y(0,t)\,\varphi_0(x) + \sum_{n=1}^{\infty} q_n(t)\sqrt{\frac{2}{L\rho}}\,\sin k_n x. \tag{6.73} $$

Here $k_n L = n\pi$. Because $y(0,t)\varphi_0(x)$ describes the motion of only an
infinitesimal length of string, $y(0,t)$ makes a negligible contribution to the
string kinetic energy, but it provides a linear coupling of the bead to the
string normal modes, $q_n(t)$, through the $Ty'^2/2$ term. Inserting the mode
expansion into the Lagrangian, and after about half a page of arithmetic, we
end up with

$$ L = \frac{M}{2}\left[\dot y(0)\right]^2 - V[y(0)] + y(0)\sum_{n=1}^{\infty} f_n q_n + \sum_{n=1}^{\infty}\left(\frac{1}{2}\dot q_n^2 - \frac{1}{2}\omega_n^2 q_n^2\right) - \frac{1}{2}\sum_{n=1}^{\infty}\left(\frac{f_n^2}{\omega_n^2}\right) y(0)^2, \tag{6.74} $$

where $\omega_n = ck_n$, and

$$ f_n = T\sqrt{\frac{2}{L\rho}}\,k_n. \tag{6.75} $$

This is exactly the Caldeira-Leggett Lagrangian, including their
frequency-shift counter-term that reflects the fact that a static displacement of an
infinite string results in no additional force on the bead. [1] When $L$ becomes
large, the eigenvalue density of states

$$ \rho(\omega) = \sum_n \delta(\omega - \omega_n) \tag{6.76} $$

becomes

$$ \rho(\omega) = \frac{L}{\pi c}. \tag{6.77} $$

The Caldeira-Leggett spectral function

$$ J(\omega) = \frac{\pi}{2}\sum_n \left(\frac{f_n^2}{\omega_n}\right)\delta(\omega - \omega_n) \tag{6.78} $$

is therefore

$$ J(\omega) = \frac{\pi}{2}\cdot\frac{2T^2k^2}{L\rho}\cdot\frac{1}{kc}\cdot\frac{L}{\pi c} = \left(\frac{T}{c}\right)\omega, \tag{6.79} $$

where we have used $c = \sqrt{T/\rho}$. Comparing with Caldeira-Leggett's $J(\omega) = \eta\omega$,
we see that the effective viscosity is given by $\eta = T/c$, as before. The
necessity of having an infinitely long string here translates into the
requirement that we must have a continuum of oscillator modes. It is only after the
sum over discrete modes $\omega_n$ is replaced by an integral over the continuum of
$\omega$'s that no energy is ever returned to the system being damped.



[1] For a finite length of string that is fixed at the far end, the string tension does add
$\frac{1}{2}Ty(0)^2/L$ to the static potential. In the mode expansion, this additional restoring force
arises from the first term of $-\varphi_0'(x) \approx 1/L + (2/L)\sum_{n=1}^{\infty}\cos k_n x$ in $\frac{1}{2}Ty(0)^2\int(\varphi_0')^2\,dx$. The
subsequent terms provide the Caldeira-Leggett counter-term. The first-term contribution
has been omitted in (6.74) as being unimportant for large $L$.
For our bead and string, the mode-expansion approach is more
complicated than d'Alembert's. In the important problem of the drag forces
induced by the emission of radiation from an accelerated charged particle,
however, the mode-expansion method leads to an informative resolution [2] of
the pathologies of the Abraham-Lorentz equation,

$$ M(\dot v - \tau\ddot v) = F_{\rm ext}, \qquad \tau = \frac{2}{3}\,\frac{e^2}{Mc^3}\,\frac{1}{4\pi\varepsilon_0}, \tag{6.80} $$

which is plagued by runaway, or apparently acausal, solutions.



6.3.4 Odd vs. Even Dimensions 

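Consider the wave equation for sound in three dimensions. We have a
velocity potential $\phi$ which obeys the wave equation

$$ \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0, \tag{6.81} $$

and from which the velocity, density, and pressure fluctuations can be
extracted as

$$ \mathbf{v}_1 = \nabla\phi, \qquad \rho_1 = -\frac{\rho_0}{c^2}\dot\phi, \qquad P_1 = c^2\rho_1. \tag{6.82} $$

In three dimensions, and considering only spherically symmetric waves,
the wave equation becomes

$$ \frac{\partial^2(r\phi)}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2(r\phi)}{\partial t^2} = 0, \tag{6.83} $$

with solution

$$ \phi(r,t) = \frac{1}{r}f\left(t - \frac{r}{c}\right) + \frac{1}{r}g\left(t + \frac{r}{c}\right). \tag{6.84} $$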

[2] G. W. Ford, R. F. O'Connell, Phys. Lett. A 157 (1991) 217.

Consider what happens if we put a point volume source at the origin (the
sudden conversion of a negligible volume of solid explosive to a large volume
of hot gas, for example). Let the rate at which volume is being intruded be
$\dot q$. The gas velocity very close to the origin will be

$$ v(r,t) = \frac{\dot q(t)}{4\pi r^2}. \tag{6.85} $$

Matching this to an outgoing wave gives

$$ \frac{\dot q(t)}{4\pi r^2} = v_1(r,t) = \frac{\partial\phi}{\partial r} = -\frac{1}{r^2}f\left(t - \frac{r}{c}\right) - \frac{1}{rc}f'\left(t - \frac{r}{c}\right). \tag{6.86} $$

Close to the origin, in the near field, the term $\propto f/r^2$ will dominate, and so

$$ -\frac{1}{4\pi}\dot q(t) = f(t). \tag{6.87} $$

Further away, in the far field or radiation field, only the second term will
survive, and so

$$ v_1 = \frac{\partial\phi}{\partial r} \approx -\frac{1}{rc}f'\left(t - \frac{r}{c}\right). \tag{6.88} $$

The far-field velocity-pulse profile $v_1$ is therefore the derivative of the
near-field $v_1$ pulse profile.




Figure 6.8: Three-dimensional blast wave. (Panels: near field and far field; vertical axis $v$ or $P$.)



The pressure pulse

$$ P_1 = -\rho_0\dot\phi = \frac{\rho_0}{4\pi r}\,\ddot q\left(t - \frac{r}{c}\right) \tag{6.89} $$

is also of this form. Thus, a sudden localized expansion of gas produces an
outgoing pressure pulse which is first positive and then negative.

This phenomenon can be seen in (old, we hope) news footage of bomb
blasts in tropical regions. A spherical vapour condensation wave can be
seen spreading out from the explosion. The condensation cloud is caused by
the air cooling below the dew-point in the low-pressure region which tails the
over-pressure blast.

Now consider what happens if we have a sheet of explosive, the simultane- 
ous detonation of every part of which gives us a one-dimensional plane-wave 
pulse. We can obtain the plane wave by adding up the individual spherical 
waves from each point on the sheet. 




Figure 6.9: Sheet-source geometry. 



Using the notation defined in figure 6.9, we have

$$ \phi(x,t) = 2\pi\int_0^{\infty} \frac{1}{\sqrt{x^2 + s^2}}\, f\left(t - \frac{\sqrt{x^2 + s^2}}{c}\right) s\,ds, \tag{6.90} $$

with $f(t) = -\dot q(t)/4\pi$, where now $\dot q$ is the rate at which volume is being
intruded per unit area of the sheet. We can write this as

$$ \begin{aligned} \phi(x,t) &= 2\pi\int_0^{\infty} f\left(t - \frac{\sqrt{x^2 + s^2}}{c}\right) d\sqrt{x^2 + s^2} \\ &= 2\pi c\int_{-\infty}^{t - x/c} f(\tau)\,d\tau \\ &= -\frac{c}{2}\int_{-\infty}^{t - x/c} \dot q(\tau)\,d\tau. \end{aligned} \tag{6.91} $$

In the second line we have defined $\tau = t - \sqrt{x^2 + s^2}/c$, which, inter alia,
interchanged the role of the upper and lower limits on the integral.

Thus, $v_1 = \phi'(x,t) = \frac{1}{2}\dot q(t - x/c)$. Since the near field motion produced
by the intruding gas is $v_1(r) = \frac{1}{2}\dot q(t)$, the far-field displacement exactly
reproduces the initial motion, suitably delayed of course. (The factor 1/2 is
because half the intruded volume goes towards producing a pulse in the
negative direction.)

In three dimensions, the far-field motion is the first derivative of the
near-field motion. In one dimension, the far-field motion is exactly the same as
the near-field motion. In two dimensions the far-field motion should therefore
be the half-derivative of the near-field motion. But how do you
half-differentiate a function? An answer is suggested by the theory of Laplace
transformations as

$$ \left(\frac{d}{dt}\right)^{1/2} F(t) \stackrel{\rm def}{=} \frac{1}{\sqrt{\pi}}\int_{-\infty}^{t} \frac{\dot F(\tau)}{\sqrt{t - \tau}}\,d\tau. \tag{6.92} $$
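The half-derivative integral can be evaluated numerically once the inverse-square-root singularity is removed. The sketch below is an illustration only, assuming the definition $(d/dt)^{1/2}F(t) = \pi^{-1/2}\int_{-\infty}^{t}\dot F(\tau)(t-\tau)^{-1/2}\,d\tau$; the function name `half_derivative`, the substitution τ = t - s², and the truncation of the s integral at a finite `s_max` are assumptions. For F(t) = e^{at} with a > 0 the integral can be done by hand and gives √a e^{at}, which provides a check; applying the operation twice should then return the ordinary derivative a e^{at}.

```python
import numpy as np

def half_derivative(dF, t, s_max=12.0, n=20001):
    """(1/sqrt(pi)) * integral_{-inf}^{t} dF(tau) / sqrt(t - tau) dtau.

    With tau = t - s**2 the integrand becomes 2 * dF(t - s**2), which is
    smooth at s = 0. dF is the ordinary derivative of F and must decay
    fast enough in the past for the truncation at s_max to be harmless.
    """
    s = np.linspace(0.0, s_max, n)
    vals = dF(t - s**2)
    h = s[1] - s[0]
    integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return 2.0 * integral / np.sqrt(np.pi)

a, t = 2.0, 0.5
# Half-derivative of exp(a t): pass in its ordinary derivative a exp(a t).
once = half_derivative(lambda tau: a * np.exp(a * tau), t)
# The result is sqrt(a) exp(a t); half-differentiate that function again.
twice = half_derivative(lambda tau: np.sqrt(a) * a * np.exp(a * tau), t)
print(once, np.sqrt(a) * np.exp(a * t))
print(twice, a * np.exp(a * t))
```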



Let us now repeat the explosive sheet calculation for an exploding wire. 




Figure 6.10: Line-source geometry. 



Using the geometry shown in figure 6.10, we have

$$ ds = d\left(\sqrt{r^2 - x^2}\right) = \frac{r\,dr}{\sqrt{r^2 - x^2}}, \tag{6.93} $$
and combining the contributions of the two parts of the wire that are the
same distance from $p$, we can write

$$ \phi(x,t) = \int_x^{\infty} \frac{1}{r}\, f\left(t - \frac{r}{c}\right)\frac{2r\,dr}{\sqrt{r^2 - x^2}} = 2\int_x^{\infty} f\left(t - \frac{r}{c}\right)\frac{dr}{\sqrt{r^2 - x^2}}, \tag{6.94} $$

with $f(t) = -\dot q(t)/4\pi$, where now $q$ is the volume intruded per unit length.
We may approximate $r^2 - x^2 \approx 2x(r - x)$ for the near parts of the wire where
$r \approx x$, since these make the dominant contribution to the integral. We also
set $\tau = t - r/c$, and then have

$$ \phi(x,t) = \frac{2c}{\sqrt{2x}}\int_{-\infty}^{t - x/c} f(\tau)\,\frac{d\tau}{\sqrt{(ct - x) - c\tau}} = -\frac{1}{4\pi}\sqrt{\frac{2c}{x}}\int_{-\infty}^{t - x/c} \dot q(\tau)\,\frac{d\tau}{\sqrt{(t - x/c) - \tau}}. \tag{6.95} $$

The far-field velocity is the $x$ gradient of this,

$$ v_1(x,t) = \frac{1}{4\pi c}\sqrt{\frac{2c}{x}}\int_{-\infty}^{t - x/c} \ddot q(\tau)\,\frac{d\tau}{\sqrt{(t - x/c) - \tau}}, \tag{6.96} $$

and is therefore proportional to the 1/2-derivative of $\dot q(t - x/c)$.



Figure 6.11: In two dimensions the far-field pulse has a long tail.



A plot of near field and far field motions in figure 6.11 shows how the 
far-field pulse never completely dies away to zero. This long tail means that 
one cannot use digital signalling in two dimensions. 






Moral Tale: One of our colleagues was performing numerical work on earth- 
quake propagation. The source of his waves was a long deep linear fault, 
so he used the two-dimensional wave equation. Not wanting to be troubled 
by the actual creation of the wave pulse, he took as initial data an outgoing 
finite-width pulse. After a short propagation time his numerical solution ap- 
peared to misbehave. New pulses were being emitted from the fault long after 
the initial one. He wasted several months in a vain attempt to improve the
stability of his code before he realized that what he was seeing was real. The 
lack of a long tail on his pulse meant that it could not have been created by 
a briefly-active line source. The new "unphysical" waves were a consequence 
of the source striving to suppress the long tail of the initial pulse. Moral: 
Always check that a solution of the form you seek actually exists before you 
waste your time trying to compute it. 

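Exercise 6.4: Use the calculus of improper integrals to show that, provided
$F(-\infty) = 0$, we have

$$ \frac{d}{dt}\left(\frac{1}{\sqrt{\pi}}\int_{-\infty}^{t} \frac{F(\tau)}{\sqrt{t - \tau}}\,d\tau\right) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{t} \frac{\dot F(\tau)}{\sqrt{t - \tau}}\,d\tau. \tag{6.97} $$

This means that

$$ \frac{d}{dt}\left(\frac{d}{dt}\right)^{1/2} F(t) = \left(\frac{d}{dt}\right)^{1/2}\frac{d}{dt} F(t). \tag{6.98} $$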

6.4 Heat Equation 

Fourier's heat equation

$$ \frac{\partial\phi}{\partial t} = \kappa\frac{\partial^2\phi}{\partial x^2} \tag{6.99} $$

is the archetypal parabolic equation. It often comes with initial data $\phi(x, t=0)$,
but this is not Cauchy data, as the curve $t$ = const. is a characteristic.
The heat equation is also known as the diffusion equation.

6.4.1 Heat Kernel 

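If we Fourier transform the initial data

$$ \phi(x, t=0) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde\phi(k)\,e^{ikx}, \tag{6.100} $$

and write

$$ \phi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde\phi(k,t)\,e^{ikx}, \tag{6.101} $$

we can plug this into the heat equation and find that

$$ \frac{\partial\tilde\phi}{\partial t} = -\kappa k^2\tilde\phi. \tag{6.102} $$

Hence,

$$ \phi(x,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde\phi(k,t)\,e^{ikx} = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde\phi(k,0)\,e^{ikx - \kappa k^2 t}. \tag{6.103} $$

We may now express $\tilde\phi(k,0)$ in terms of $\phi(x,0)$ and rearrange the order of
integration to get

$$ \begin{aligned} \phi(x,t) &= \int_{-\infty}^{\infty}\frac{dk}{2\pi}\left(\int_{-\infty}^{\infty}\phi(\xi,0)\,e^{-ik\xi}\,d\xi\right) e^{ikx - \kappa k^2 t} \\ &= \int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ik(x-\xi) - \kappa k^2 t}\right)\phi(\xi,0)\,d\xi \\ &= \int_{-\infty}^{\infty} G(x,\xi,t)\,\phi(\xi,0)\,d\xi, \end{aligned} \tag{6.104} $$

where

$$ G(x,\xi,t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ik(x-\xi) - \kappa k^2 t} = \frac{1}{\sqrt{4\pi\kappa t}}\exp\left\{-\frac{1}{4\kappa t}(x - \xi)^2\right\}. \tag{6.105} $$

Here, $G(x,\xi,t)$ is the heat kernel. It represents the spreading of a unit blob
of heat.

Figure 6.12: The heat kernel at three successive times.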

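As the heat spreads, the total amount of heat, represented by the area
under the curve in figure 6.12, remains constant:

$$ \int_{-\infty}^{\infty}\frac{1}{\sqrt{4\pi\kappa t}}\exp\left\{-\frac{1}{4\kappa t}(x - \xi)^2\right\} dx = 1. \tag{6.106} $$

The heat kernel possesses a semigroup property

$$ G(x,\xi,t_1 + t_2) = \int_{-\infty}^{\infty} G(x,\eta,t_2)\,G(\eta,\xi,t_1)\,d\eta. \tag{6.107} $$

Exercise: Prove this.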
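Conservation of heat and the semigroup property of the Gaussian heat kernel can be spot-checked numerically before attempting the proof. The sketch below is an illustration only; the function names, the parameter values, and the truncation of the infinite line to a wide finite interval are assumptions.

```python
import numpy as np

def heat_kernel(x, xi, t, kappa=1.0):
    """Gaussian heat kernel G(x, xi, t) for diffusivity kappa."""
    return np.exp(-(x - xi)**2 / (4.0 * kappa * t)) / np.sqrt(4.0 * np.pi * kappa * t)

def trapezoid(values, spacing):
    """Composite trapezoid rule on an equally spaced grid."""
    return spacing * (values.sum() - 0.5 * (values[0] + values[-1]))

kappa, t1, t2 = 0.7, 0.3, 0.5
eta = np.linspace(-30.0, 30.0, 6001)      # stands in for the whole real line
d_eta = eta[1] - eta[0]

# Total heat: the integral of the kernel over all positions.
total_heat = trapezoid(heat_kernel(eta, 0.2, t1, kappa), d_eta)

# Semigroup property: spread for t1, then for t2, versus t1 + t2 at once.
x, xi = 1.1, -0.4
composed = trapezoid(heat_kernel(x, eta, t2, kappa) * heat_kernel(eta, xi, t1, kappa), d_eta)
direct = heat_kernel(x, xi, t1 + t2, kappa)
print(total_heat, composed, direct)
```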



6.4.2 Causal Green Function 

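Now we consider the inhomogeneous heat equation

$$ \frac{\partial u}{\partial t} - \kappa\frac{\partial^2 u}{\partial x^2} = q(x,t), \tag{6.108} $$

with initial data $u(x,0) = u_0(x)$. We define a causal Green function by

$$ \left(\frac{\partial}{\partial t} - \kappa\frac{\partial^2}{\partial x^2}\right) G(x,t;\xi,\tau) = \delta(x - \xi)\,\delta(t - \tau) \tag{6.109} $$

and the requirement that $G(x,t;\xi,\tau) = 0$ if $t < \tau$. Integrating the equation
from $t = \tau - \epsilon$ to $t = \tau + \epsilon$ tells us that

$$ G(x,\tau + \epsilon;\xi,\tau) = \delta(x - \xi). \tag{6.110} $$

Taking this delta function as initial data $\phi(x, t = \tau)$ and inserting into (6.104)
we read off

$$ G(x,t;\xi,\tau) = \theta(t - \tau)\,\frac{1}{\sqrt{4\pi\kappa(t - \tau)}}\exp\left\{-\frac{1}{4\kappa(t - \tau)}(x - \xi)^2\right\}. \tag{6.111} $$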

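We apply this Green function to the solution of a problem involving both
a heat source and initial data given at $t = 0$ on the entire real line. We
exploit a variant of the Lagrange-identity method we used for solving
one-dimensional ODE's with inhomogeneous boundary conditions. Let

$$ D_{x,t} \equiv \frac{\partial}{\partial t} - \kappa\frac{\partial^2}{\partial x^2}, \tag{6.112} $$

and observe that its formal adjoint,

$$ D^\dagger_{x,t} \equiv -\frac{\partial}{\partial t} - \kappa\frac{\partial^2}{\partial x^2}, \tag{6.113} $$

is a "backward" heat-equation operator. The corresponding "backward"
Green function

$$ G^\dagger(x,t;\xi,\tau) = \theta(\tau - t)\,\frac{1}{\sqrt{4\pi\kappa(\tau - t)}}\exp\left\{-\frac{1}{4\kappa(\tau - t)}(x - \xi)^2\right\} \tag{6.114} $$

obeys

$$ D^\dagger_{x,t}\,G^\dagger(x,t;\xi,\tau) = \delta(x - \xi)\,\delta(t - \tau), \tag{6.115} $$

with adjoint boundary conditions. These make $G^\dagger$ anti-causal, in that $G^\dagger(t - \tau)$
vanishes when $t > \tau$. Now we make use of the two-dimensional Lagrange
identity

$$ \begin{aligned} &\int_{-\infty}^{\infty} dx\int_0^T dt\left\{ u(x,t)\,D^\dagger_{x,t}G^\dagger(x,t;\xi,\tau) - \big(D_{x,t}u(x,t)\big)\,G^\dagger(x,t;\xi,\tau) \right\} \\ &\qquad = \int_{-\infty}^{\infty} dx\left\{ u(x,0)\,G^\dagger(x,0;\xi,\tau) \right\} - \int_{-\infty}^{\infty} dx\left\{ u(x,T)\,G^\dagger(x,T;\xi,\tau) \right\}. \end{aligned} \tag{6.116} $$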
Assume that $(\xi,\tau)$ lies within the region of integration. Then the left hand
side is equal to

$$ u(\xi,\tau) - \int_{-\infty}^{\infty} dx\int_0^T dt\left\{ q(x,t)\,G^\dagger(x,t;\xi,\tau) \right\}. \tag{6.117} $$

On the right hand side, the second integral vanishes because $G^\dagger$ is zero on
$t = T$. Thus,

$$ u(\xi,\tau) = \int_{-\infty}^{\infty} dx\int_0^T dt\left\{ q(x,t)\,G^\dagger(x,t;\xi,\tau) \right\} + \int_{-\infty}^{\infty}\left\{ u(x,0)\,G^\dagger(x,0;\xi,\tau) \right\} dx. \tag{6.118} $$

Rewriting this by using

$$ G^\dagger(x,t;\xi,\tau) = G(\xi,\tau;x,t), \tag{6.119} $$

and relabeling $x \leftrightarrow \xi$ and $t \leftrightarrow \tau$, we have

$$ u(x,t) = \int_{-\infty}^{\infty} G(x,t;\xi,0)\,u_0(\xi)\,d\xi + \int_{-\infty}^{\infty}\int_0^{t} G(x,t;\xi,\tau)\,q(\xi,\tau)\,d\xi\,d\tau. \tag{6.120} $$

Note how the effects of any heat source $q(x,t)$ active prior to the initial-data
epoch at $t = 0$ have been subsumed into the evolution of the initial data.



6.4.3 Duhamel's Principle 

Often, the temperature of the spatial boundary of a region is specified in 
addition to the initial data. Dealing with this type of problem leads us to a 
new strategy. 

Suppose we are required to solve

$$ \frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2} \tag{6.121} $$

for the semi-infinite rod shown in figure 6.13. We are given a specified
temperature, $u(0,t) = h(t)$, at the end $x = 0$, and for all other points $x > 0$ we
are given an initial condition $u(x,0) = 0$.
Figure 6.13: Semi-infinite rod heated at one end.

We begin by finding a solution $w(x,t)$ that satisfies the heat equation with
$w(0,t) = 1$ and initial data $w(x,0) = 0$, $x > 0$. This solution is constructed
in problem 6.14, and is

$$ w = \theta(t)\left\{1 - {\rm erf}\left(\frac{x}{2\sqrt{\kappa t}}\right)\right\}. \tag{6.122} $$

Here erf$(x)$ is the error function

$$ {\rm erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-z^2}\,dz, \tag{6.123} $$

which has the properties that erf$(0) = 0$ and erf$(x) \to 1$ as $x \to \infty$. See
figure 6.14.

Figure 6.14: Error function.

If we were given

$$ h(t) = h_0\,\theta(t - t_0), \tag{6.124} $$

then the desired solution would be

$$ u(x,t) = h_0\,w(x, t - t_0). \tag{6.125} $$
For a sum

$$ h(t) = \sum_n h_n\,\theta(t - t_n), \tag{6.126} $$

the principle of superposition (i.e. the linearity of the problem) tells us that
the solution is the corresponding sum

$$ u(x,t) = \sum_n h_n\,w(x, t - t_n). \tag{6.127} $$

We therefore decompose $h(t)$ into a sum of step functions

$$ h(t) = h(0) + \int_0^t \dot h(\tau)\,d\tau = h(0) + \int_0^{\infty}\theta(t - \tau)\,\dot h(\tau)\,d\tau. \tag{6.128} $$

It should now be clear that

$$ \begin{aligned} u(x,t) &= \int_0^t w(x, t - \tau)\,\dot h(\tau)\,d\tau + h(0)\,w(x,t) \\ &= -\int_0^t\left(\frac{\partial}{\partial\tau}w(x, t - \tau)\right) h(\tau)\,d\tau \\ &= \int_0^t\left(\frac{\partial}{\partial t}w(x, t - \tau)\right) h(\tau)\,d\tau. \end{aligned} \tag{6.129} $$

This is called Duhamel's solution, and the trick of expressing the data as a
sum of Heaviside step functions is called Duhamel's principle.

We do not need to be as clever as Duhamel. We could have obtained 
this result by using the method of images to find a suitable causal Green 
function for the half line, and then using the same Lagrange-identity method 
as before. 
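Duhamel's formula lends itself to a direct numerical check. The sketch below is an illustration only: it assumes the step response w(x,s) = erfc(x / (2 sqrt(kappa s))) for s > 0, differentiates it by hand, and evaluates the last line of (6.129) with a midpoint rule. The function names and test data are hypothetical. For constant end temperature the result should reproduce w itself, and for a delayed step it should reproduce the delayed w.

```python
import math
import numpy as np

def dw_dt(x, s, kappa=1.0):
    """Time derivative of w(x, s) = erfc(x / (2 sqrt(kappa s))), for s > 0."""
    return (x / (2.0 * np.sqrt(np.pi * kappa))) * s**-1.5 * np.exp(-x**2 / (4.0 * kappa * s))

def duhamel(h, x, t, kappa=1.0, n=20001):
    """u(x, t) = integral_0^t (d/dt) w(x, t - tau) h(tau) dtau (midpoint rule)."""
    d_tau = t / n
    tau = (np.arange(n) + 0.5) * d_tau    # midpoints never hit tau = t exactly
    return d_tau * np.sum(dw_dt(x, t - tau, kappa) * h(tau))

x, t, kappa = 0.8, 1.3, 0.5

# End held at unit temperature from t = 0 onwards: expect w(x, t).
u_const = duhamel(lambda tau: np.ones_like(tau), x, t, kappa)
w_exact = math.erfc(x / (2.0 * math.sqrt(kappa * t)))

# End switched on at t0: expect w(x, t - t0).
t0 = 0.4
u_step = duhamel(lambda tau: np.where(tau > t0, 1.0, 0.0), x, t, kappa)
w_delayed = math.erfc(x / (2.0 * math.sqrt(kappa * (t - t0))))
print(u_const, w_exact, u_step, w_delayed)
```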

6.5 Potential Theory 

The study of boundary-value problems involving the Laplacian is usually
known as "Potential Theory." We seek solutions to these problems in some
region $\Omega$, whose boundary we denote by the symbol $\partial\Omega$.

Poisson's equation, $-\nabla^2\chi({\bf r}) = f({\bf r})$, ${\bf r} \in \Omega$, and the Laplace equation to
which it reduces when $f({\bf r}) = 0$, come along with various boundary
conditions, of which the commonest are

$$ \chi = g({\bf r}) \ \text{on } \partial\Omega \quad \text{(Dirichlet)}, \qquad ({\bf n}\cdot\nabla)\chi = g({\bf r}) \ \text{on } \partial\Omega \quad \text{(Neumann)}. \tag{6.130} $$

A function for which $\nabla^2\chi = 0$ in some region $\Omega$ is said to be harmonic there.

6.5.1 Uniqueness and existence of solutions 

We begin by observing that we need to be a little more precise about what
it means for a solution to "take" a given value on a boundary. If we ask for
a solution to the problem $\nabla^2\varphi = 0$ within $\Omega = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$
and $\varphi = 1$ on $\partial\Omega$, someone might claim that the function defined by setting
$\varphi(x,y) = 0$ for $x^2 + y^2 < 1$ and $\varphi(x,y) = 1$ for $x^2 + y^2 = 1$ does the job,
but such a discontinuous "solution" is hardly what we had in mind when we
stated the problem. We must interpret the phrase "takes a given value on the
boundary" as meaning that the boundary data is the limit, as we approach
the boundary, of the solution within $\Omega$.

With this understanding, we assert that a function harmonic in a bounded
subset $\Omega$ of $\mathbb{R}^n$ is uniquely determined by the values it takes on the boundary
of $\Omega$. To see that this is so, suppose that $\varphi_1$ and $\varphi_2$ both satisfy $\nabla^2\varphi = 0$ in
$\Omega$, and coincide on the boundary. Then $\chi = \varphi_1 - \varphi_2$ obeys $\nabla^2\chi = 0$ in $\Omega$,
and is zero on the boundary. Integrating by parts we find that

$$ \int_{\Omega} |\nabla\chi|^2\,d^n r = \int_{\partial\Omega} \chi\,({\bf n}\cdot\nabla)\chi\,dS = 0. \tag{6.131} $$

Here $dS$ is the element of area on the boundary and ${\bf n}$ the outward-directed
normal. Now, because the second derivatives exist, the partial derivatives
entering into $\nabla\chi$ must be continuous, and so the vanishing of the integral of
$|\nabla\chi|^2$ tells us that $\nabla\chi$ is zero everywhere within $\Omega$. This means that $\chi$ is
constant, and because it is zero on the boundary it is zero everywhere.

An almost identical argument shows that if $\Omega$ is a bounded connected
region, and if $\varphi_1$ and $\varphi_2$ both satisfy $\nabla^2\varphi = 0$ within $\Omega$ and take the same
values of $({\bf n}\cdot\nabla)\varphi$ on the boundary, then $\varphi_1 = \varphi_2$ + const. We have therefore
shown that, if it exists, the solution of the Dirichlet boundary value problem
is unique, and the solution of the Neumann problem is unique up to the
addition of an arbitrary constant.

In the Neumann case, with boundary condition $({\bf n}\cdot\nabla)\varphi = g({\bf r})$, an
integration by parts gives

$$ \int_{\Omega} \nabla^2\varphi\,d^n r = \int_{\partial\Omega} ({\bf n}\cdot\nabla)\varphi\,dS = \int_{\partial\Omega} g\,dS, \tag{6.132} $$

and so the boundary data $g({\bf r})$ must satisfy $\int_{\partial\Omega} g\,dS = 0$ if a solution to
$\nabla^2\varphi = 0$ is to exist. This is an example of the Fredholm alternative that
relates the existence of a non-trivial null space to constraints on the source
terms. For the inhomogeneous equation $-\nabla^2\varphi = f$, the Fredholm constraint
becomes

$$ \int_{\partial\Omega} g\,dS + \int_{\Omega} f\,d^n r = 0. \tag{6.133} $$

Given that we have satisfied any Fredholm constraint, do solutions to the
Dirichlet and Neumann problem always exist? That solutions should exist is
suggested by physics: the Dirichlet problem corresponds to an electrostatic
problem with specified boundary potentials and the Neumann problem
corresponds to finding the electric potential within a resistive material with
prescribed current sources on the boundary. The Fredholm constraint says
that if we drive current into the material, we must let it out somewhere.
Surely solutions always exist to these physics problems? In the Dirichlet case
we can even make a mathematically plausible argument for existence: We
observe that the boundary-value problem

$$ \nabla^2\varphi = 0, \quad {\bf r} \in \Omega, \qquad \varphi = f, \quad {\bf r} \in \partial\Omega \tag{6.134} $$

is solved by taking $\varphi$ to be the $\chi$ that minimizes the functional

$$ J[\chi] = \int_{\Omega} |\nabla\chi|^2\,d^n r \tag{6.135} $$

over the set of continuously differentiable functions taking the given boundary
values. Since $J[\chi]$ is positive, and hence bounded below, it seems intuitively
obvious that there must be some function $\chi$ for which $J[\chi]$ is a minimum.
The appeal of this Dirichlet principle argument led even Riemann astray.
The fallacy was exposed by Weierstrass who provided counterexamples.
Consider, for example, the problem of finding a function $\varphi(x,y)$ obeying
$\nabla^2\varphi = 0$ within the punctured disc $D' = \{(x,y) \in \mathbb{R}^2 : 0 < x^2 + y^2 < 1\}$
with boundary data $\varphi(x,y) = 1$ on the outer boundary at $x^2 + y^2 = 1$ and
$\varphi(0,0) = 0$ on the inner boundary at the origin. We substitute the trial
functions

$$ \chi_\alpha(x,y) = (x^2 + y^2)^\alpha, \qquad \alpha > 0, \tag{6.136} $$

all of which satisfy the boundary data, into the positive functional

$$ J[\chi] = \int_{D'} |\nabla\chi|^2\,dx\,dy \tag{6.137} $$

to find $J[\chi_\alpha] = 2\pi\alpha$. This number can be made as small as we like, and so
the infimum of the functional $J[\chi]$ is zero. But if there is a minimizing $\varphi$,
then $J[\varphi] = 0$ implies that $\varphi$ is a constant, and a constant cannot satisfy the
boundary conditions.

An analogous problem reveals itself in three dimensions when the
boundary of $\Omega$ has a sharp re-entrant spike that is held at a different potential from
the rest of the boundary. In this case we can again find a sequence of trial
functions $\chi({\bf r})$ for which $J[\chi]$ becomes arbitrarily small, but the sequence of
$\chi$'s has no limit satisfying the boundary conditions. The physics argument
also fails: if we tried to create a physical realization of this situation, the
electric field would become infinite near the spike, and the charge would leak
off and thwart our attempts to establish the potential difference. For
reasonably smooth boundaries, however, a minimizing function does exist.

The Dirichlet-Poisson problem

$$ -\nabla^2\varphi({\bf r}) = f({\bf r}), \quad {\bf r} \in \Omega, \qquad \varphi({\bf r}) = g({\bf r}), \quad {\bf r} \in \partial\Omega, \tag{6.138} $$

and the Neumann-Poisson problem

$$ -\nabla^2\varphi({\bf r}) = f({\bf r}), \quad {\bf r} \in \Omega, \qquad ({\bf n}\cdot\nabla)\varphi({\bf r}) = g({\bf r}), \quad {\bf r} \in \partial\Omega, $$

supplemented with the Fredholm constraint

$$ \int_{\Omega} f\,d^n r + \int_{\partial\Omega} g\,dS = 0, \tag{6.139} $$

also have solutions when $\partial\Omega$ is reasonably smooth. For the Neumann-Poisson
problem, with the Fredholm constraint as stated, the region $\Omega$ must be
connected, but its boundary need not be. For example, $\Omega$ can be the region
between two nested spherical shells.

Exercise 6.5: Why did we insist that the region $\Omega$ be connected in our
discussion of the Neumann problem? (Hint: how must we modify the Fredholm
constraint when $\Omega$ consists of two or more disconnected regions?)

Exercise 6.6: Neumann variational principles. Let $\Omega$ be a bounded and
connected three-dimensional region with a smooth boundary. Given a function $f$
defined on $\Omega$ and such that $\int_{\Omega} f\,d^3r = 0$, define the functional

$$ J[\chi] = \int_{\Omega}\left\{\frac{1}{2}|\nabla\chi|^2 - \chi f\right\} d^3r. $$

Suppose that $\varphi$ is a solution of the Neumann problem

$$ -\nabla^2\varphi({\bf r}) = f({\bf r}), \quad {\bf r} \in \Omega, \qquad ({\bf n}\cdot\nabla)\varphi({\bf r}) = 0, \quad {\bf r} \in \partial\Omega. $$

Show that

$$ J[\chi] = J[\varphi] + \int_{\Omega}\frac{1}{2}|\nabla(\chi - \varphi)|^2\,d^3r \geq J[\varphi] = -\int_{\Omega}\frac{1}{2}|\nabla\varphi|^2\,d^3r = -\frac{1}{2}\int_{\Omega}\varphi f\,d^3r. $$

Deduce that $\varphi$ is determined, up to the addition of a constant, as the function
that minimizes $J[\chi]$ over the space of all continuously differentiable $\chi$ (and
not just over functions satisfying the Neumann boundary condition).

Similarly, for $g$ a function defined on the boundary $\partial\Omega$ and such that $\int_{\partial\Omega} g\,dS = 0$,
set

$$ K[\chi] = \int_{\Omega}\frac{1}{2}|\nabla\chi|^2\,d^3r - \int_{\partial\Omega}\chi g\,dS. $$

Now suppose that $\phi$ is a solution of the Neumann problem

$$ -\nabla^2\phi({\bf r}) = 0, \quad {\bf r} \in \Omega, \qquad ({\bf n}\cdot\nabla)\phi({\bf r}) = g({\bf r}), \quad {\bf r} \in \partial\Omega. $$

Show that

$$ K[\chi] = K[\phi] + \int_{\Omega}\frac{1}{2}|\nabla(\chi - \phi)|^2\,d^3r \geq K[\phi] = -\int_{\Omega}\frac{1}{2}|\nabla\phi|^2\,d^3r = -\frac{1}{2}\int_{\partial\Omega}\phi g\,dS. $$

Deduce that $\phi$ is determined up to a constant as the function that minimizes
$K[\chi]$ over the space of all continuously differentiable $\chi$ (and, again, not just
over functions satisfying the Neumann boundary condition).

Show that when $f$ and $g$ fail to satisfy the integral conditions required for
the existence of the Neumann solution, the corresponding functionals are not
bounded below, and so no minimizing function can exist.

Exercise 6.7: Helmholtz decomposition. Let $\Omega$ be a bounded connected
three-dimensional region with smooth boundary $\partial\Omega$.

a) Cite the conditions for the existence of a solution to a suitable Neumann
problem to show that if ${\bf u}$ is a smooth vector field defined in $\Omega$, then
there exist a unique solenoidal (i.e. having zero divergence) vector field
${\bf v}$ with ${\bf v}\cdot{\bf n} = 0$ on the boundary $\partial\Omega$, and a unique (up to the addition
of a constant) scalar field $\phi$ such that

$$ {\bf u} = {\bf v} + \nabla\phi. $$

Here ${\bf n}$ is the outward normal to the (assumed smooth) bounding surface
of $\Omega$.

b) In many cases (but not always) we can write a solenoidal vector field ${\bf v}$
as ${\bf v}$ = curl ${\bf w}$. Again by appealing to the conditions for existence and
uniqueness of a Neumann problem solution, show that if we can write
${\bf v}$ = curl ${\bf w}$, then ${\bf w}$ is not unique, but we can always make it unique by
demanding that it obey the conditions div ${\bf w} = 0$ and ${\bf w}\cdot{\bf n} = 0$.

c) Appeal to the Helmholtz decomposition of part a) with ${\bf u} \to ({\bf v}\cdot\nabla){\bf v}$ to
show that in the Euler equation

$$ \frac{\partial{\bf v}}{\partial t} + ({\bf v}\cdot\nabla){\bf v} = -\nabla P, \qquad {\bf v}\cdot{\bf n} = 0 \ \text{on } \partial\Omega, $$

governing the motion of an incompressible (div ${\bf v} = 0$) fluid the
instantaneous flow field ${\bf v}(x,y,z,t)$ uniquely determines $\partial{\bf v}/\partial t$, and hence the
time evolution of the flow. (This observation provides the basis of
practical algorithms for computing incompressible flows.)

We can always write the solenoidal field as ${\bf v}$ = curl ${\bf w}$ + ${\bf h}$, where ${\bf h}$ obeys
$\nabla^2{\bf h} = 0$ with suitable boundary conditions. See exercise 6.16.

6.5.2 Separation of Variables 

Cartesian Coordinates 

When the region of interest is a square or a rectangle, we can solve Laplace boundary problems by separating the Laplace operator in cartesian co-ordinates. Let

$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} = 0, \tag{6.140}$$

and write

$$\varphi = X(x)Y(y), \tag{6.141}$$

so that

$$\frac{1}{X}\frac{\partial^2 X}{\partial x^2} + \frac{1}{Y}\frac{\partial^2 Y}{\partial y^2} = 0. \tag{6.142}$$

Since the first term is a function of $x$ only, and the second of $y$ only, both must be constants and the sum of these constants must be zero. Therefore

$$\frac{1}{X}\frac{d^2X}{dx^2} = -k^2, \qquad \frac{1}{Y}\frac{d^2Y}{dy^2} = k^2, \tag{6.143}$$

or, equivalently

$$\frac{d^2X}{dx^2} + k^2X = 0, \qquad \frac{d^2Y}{dy^2} - k^2Y = 0. \tag{6.144}$$

The number that we have, for later convenience, written as $k^2$ is called a separation constant. The solutions are $X = e^{\pm ikx}$ and $Y = e^{\pm ky}$. Thus

$$\varphi = e^{\pm ikx}e^{\pm ky}, \tag{6.145}$$

or a sum of such terms where the allowed $k$'s are determined by the boundary conditions.

How do we know that the separated form $X(x)Y(y)$ captures all possible solutions? We can be confident that we have them all if we can use the separated solutions to solve boundary-value problems with arbitrary boundary data.

Figure 6.15: Square region.



We can use our separated solutions to construct the unique harmonic function taking given values on the sides of a square of side $L$ shown in figure 6.15. To see how to do this, consider the four families of functions

$$\begin{aligned}
\varphi_{1,n}(x,y) &= \sqrt{\frac{2}{L}}\,\frac{1}{\sinh n\pi}\,\sin\frac{n\pi x}{L}\,\sinh\frac{n\pi y}{L},\\
\varphi_{2,n}(x,y) &= \sqrt{\frac{2}{L}}\,\frac{1}{\sinh n\pi}\,\sinh\frac{n\pi x}{L}\,\sin\frac{n\pi y}{L},\\
\varphi_{3,n}(x,y) &= \sqrt{\frac{2}{L}}\,\frac{1}{\sinh n\pi}\,\sin\frac{n\pi x}{L}\,\sinh\frac{n\pi(L-y)}{L},\\
\varphi_{4,n}(x,y) &= \sqrt{\frac{2}{L}}\,\frac{1}{\sinh n\pi}\,\sinh\frac{n\pi(L-x)}{L}\,\sin\frac{n\pi y}{L}.
\end{aligned} \tag{6.146}$$

Each of these comprises solutions to $\nabla^2\varphi = 0$. The family $\varphi_{1,n}(x,y)$ has been constructed so that every member is zero on three sides of the square, but on the side $y = L$ it becomes $\varphi_{1,n}(x,L) = \sqrt{2/L}\,\sin(n\pi x/L)$. The $\varphi_{1,n}(x,L)$ therefore constitute a complete orthonormal set in terms of which we can expand the boundary data on the side $y = L$. Similarly, the other families are non-zero on only one side, and are complete there. Thus, any boundary data can be expanded in terms of these four function sets, and the solution to the boundary value problem is given by a sum

$$\varphi(x,y) = \sum_{m=1}^{4}\sum_{n=1}^{\infty} a_{m,n}\,\varphi_{m,n}(x,y). \tag{6.147}$$

The solution to $\nabla^2\varphi = 0$ in the unit square with $\varphi = 1$ on the side $y = 1$ and zero on the other sides is, for example,

$$\varphi(x,y) = \frac{4}{\pi}\sum_{n=0}^{\infty}\frac{1}{(2n+1)\sinh\big((2n+1)\pi\big)}\,\sin\big((2n+1)\pi x\big)\,\sinh\big((2n+1)\pi y\big). \tag{6.148}$$

Figure 6.16: Plot of first thirty terms in equation (6.148).
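As a rough numerical check of the unit-square example, the following sketch sums the first thirty separated solutions for the problem with $\varphi = 1$ on the side $y = 1$ and zero on the other sides. It assumes the standard sine-series coefficients $4/((2n+1)\pi)$ for constant boundary data; the function name `phi_partial` and the sample points are illustrative choices, not part of the text.

```python
import math

def phi_partial(x, y, terms=30):
    """Sum the first `terms` odd-harmonic separated solutions on the unit square."""
    total = 0.0
    for n in range(terms):
        k = (2 * n + 1) * math.pi
        # sinh(k*y)/sinh(k), rewritten with decaying exponentials so that
        # large k does not overflow.
        ratio = (math.exp(-k * (1.0 - y)) * (1.0 - math.exp(-2.0 * k * y))
                 / (1.0 - math.exp(-2.0 * k)))
        total += (4.0 / k) * math.sin(k * x) * ratio
    return total

centre = phi_partial(0.5, 0.5)   # a symmetry argument predicts 1/4 here
bottom = phi_partial(0.3, 0.0)   # on a grounded side
top = phi_partial(0.5, 1.0)      # on the side held at 1 (slow convergence)
print(centre, bottom, top)
```

The centre value can be anticipated without any computation: adding the four solutions obtained by rotating the boundary data onto each side in turn gives the constant function 1, so each contributes 1/4 at the centre. On the side $y = 1$ itself the series converges only slowly, which is the usual behaviour of a Fourier series for discontinuous data.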



For cubes, and higher dimensional hypercubes, we can use similar boundary expansions. For the unit cube in three dimensions we would use

$$\varphi_{1,nm}(x,y,z) = \frac{1}{\sinh\left(\pi\sqrt{n^2+m^2}\right)}\,\sin(n\pi x)\,\sin(m\pi y)\,\sinh\left(\pi z\sqrt{n^2+m^2}\right),$$

to expand the data on the face $z = 1$, together with five other solution families, one for each of the other five faces of the cube.

If some of the boundaries are at infinity, we may need only some of these functions.

Example: Figure 6.17 shows three conducting sheets, each infinite in the $z$ direction. The central one has width $a$, and is held at voltage $V_0$. The outer two extend to infinity also in the $y$ direction, and are grounded. The resulting potential should tend to zero as $|x|, |y| \to \infty$.

Figure 6.17: Conducting sheets.

The voltage in the $x = 0$ plane is

$$V(0,y) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,a(k)\,e^{-iky}, \tag{6.149}$$

where

$$a(k) = V_0\int_{-a/2}^{a/2} e^{iky}\,dy = \frac{2V_0}{k}\sin(ka/2). \tag{6.150}$$

Then, taking into account the boundary condition at large $x$, the solution to $\nabla^2 V = 0$ is

$$V(x,y) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,a(k)\,e^{-iky}\,e^{-|k||x|}. \tag{6.151}$$

The evaluation of this integral, and finding the charge distribution on the sheets, is left as an exercise.

The Cauchy Problem is Ill-posed 

Although the Laplace equation has no characteristics, the Cauchy data problem is ill-posed, meaning that the solution is not a continuous function of the data. To see this, suppose we are given $\nabla^2\varphi = 0$ with Cauchy data on $y = 0$:

$$\varphi(x,0) = 0, \qquad \left.\frac{\partial\varphi}{\partial y}\right|_{y=0} = \epsilon\sin kx. \tag{6.152}$$

Then

$$\varphi(x,y) = \frac{\epsilon}{k}\sin(kx)\sinh(ky). \tag{6.153}$$

Provided $k$ is large enough, even if $\epsilon$ is tiny, the exponential growth of the hyperbolic sine will make this arbitrarily large. Any infinitesimal uncertainty in the high frequency part of the initial data will be vastly amplified, and the solution, although formally correct, is useless in practice.
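The amplification can be made concrete with a few numbers. The sketch below evaluates the peak size of the solution $(\epsilon/k)\sin(kx)\sinh(ky)$ at a fixed distance from the data line for several wavenumbers; the values of `eps`, `y` and the list of wavenumbers are arbitrary illustrative choices.

```python
import math

def cauchy_solution(x, y, k, eps):
    """phi = (eps/k) sin(kx) sinh(ky): zero on y = 0, with y-derivative eps*sin(kx) there."""
    return (eps / k) * math.sin(k * x) * math.sinh(k * y)

eps = 1e-6   # size of the perturbation in the Cauchy data
y = 1.0      # distance from the line carrying the data

peaks = {}
for k in (1, 10, 50):
    # maximum of |phi| over x at this height, attained where sin(kx) = 1
    peaks[k] = (eps / k) * math.sinh(k * y)
    print(k, peaks[k])
```

A perturbation of fixed size in the data is harmless at $k = 1$ but dominates everything at $k = 50$, which is the sense in which the solution fails to depend continuously on the data.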



Polar coordinates 

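We can use the separation of variables method in polar coordinates. Here,

$$\nabla^2\chi = \frac{\partial^2\chi}{\partial r^2} + \frac{1}{r}\frac{\partial\chi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\chi}{\partial\theta^2}. \tag{6.154}$$

Set

$$\chi(r,\theta) = R(r)\Theta(\theta). \tag{6.155}$$

Then $\nabla^2\chi = 0$ implies

$$\begin{aligned}
\frac{r^2}{R}\left(\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr}\right) &= -\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2}\\
&= m^2,
\end{aligned} \tag{6.156}$$

where in the second line we have written the separation constant as $m^2$. Therefore,

$$\frac{d^2\Theta}{d\theta^2} + m^2\Theta = 0, \tag{6.157}$$

implying that $\Theta = e^{im\theta}$, where $m$ must be an integer if $\Theta$ is to be single-valued, and

$$r^2\frac{d^2R}{dr^2} + r\frac{dR}{dr} - m^2R = 0, \tag{6.158}$$

whose solutions are $R = r^{\pm m}$ when $m \ne 0$, and $1$ or $\ln r$ when $m = 0$. The general solution is therefore a sum of these

$$\chi = A_0 + B_0\ln r + \sum_{m\ne 0}\left(A_m r^{|m|} + B_m r^{-|m|}\right)e^{im\theta}. \tag{6.159}$$

The singular terms, $\ln r$ and $r^{-|m|}$, are not solutions at the origin, and should be omitted when that point is part of the region where $\nabla^2\chi = 0$.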
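Example: Dirichlet problem in the interior of the unit circle. Solve $\nabla^2\chi = 0$ in $\Omega = \{\mathbf{r}\in\mathbb{R}^2 : |\mathbf{r}| < 1\}$ with $\chi = f(\theta)$ on $\partial\Omega = \{|\mathbf{r}| = 1\}$.

Figure 6.18: Dirichlet problem in the unit circle.

We expand

$$\chi(r,\theta) = \sum_{m=-\infty}^{\infty} A_m r^{|m|}e^{im\theta}, \tag{6.160}$$

and read off the coefficients from the boundary data as

$$A_m = \frac{1}{2\pi}\int_0^{2\pi} e^{-im\theta'}f(\theta')\,d\theta'. \tag{6.161}$$

Thus,

$$\chi = \frac{1}{2\pi}\int_0^{2\pi}\left[\sum_{m=-\infty}^{\infty} r^{|m|}e^{im(\theta-\theta')}\right]f(\theta')\,d\theta'. \tag{6.162}$$

We can sum the geometric series

$$\sum_{m=-\infty}^{\infty} r^{|m|}e^{im(\theta-\theta')} = \frac{1}{1-re^{i(\theta-\theta')}} + \frac{re^{-i(\theta-\theta')}}{1-re^{-i(\theta-\theta')}} = \frac{1-r^2}{1-2r\cos(\theta-\theta')+r^2}. \tag{6.163}$$

Therefore,

$$\chi(r,\theta) = \frac{1}{2\pi}\int_0^{2\pi}\left(\frac{1-r^2}{1-2r\cos(\theta-\theta')+r^2}\right)f(\theta')\,d\theta'. \tag{6.164}$$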
This expression is known as the Poisson kernel formula. Observe how the integrand sharpens towards a delta function as $r$ approaches unity, and so ensures that the limiting value of $\chi(r,\theta)$ is consistent with the boundary data.

If we set $r = 0$ in the Poisson formula, we find

$$\chi(0,\theta) = \frac{1}{2\pi}\int_0^{2\pi} f(\theta')\,d\theta'. \tag{6.165}$$

We deduce that if $\nabla^2\chi = 0$ in some domain then the value of $\chi$ at a point in the domain is the average of its values on any circle centred on the chosen point and lying wholly in the domain.

This average-value property means that $\chi$ can have no local maxima or minima within $\Omega$. The same result holds in $\mathbb{R}^n$, and a formal theorem to this effect can be proved:

Theorem (The mean-value theorem for harmonic functions): If $\chi$ is harmonic ($\nabla^2\chi = 0$) within the bounded (open, connected) domain $\Omega \subset \mathbb{R}^n$, and is continuous on its closure $\overline{\Omega}$, and if $m \le \chi \le M$ on $\partial\Omega$, then $m < \chi < M$ within $\Omega$ (unless, that is, $m = M$, when $\chi = m$ is constant).
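The Poisson kernel and the average-value property are easy to test numerically. The sketch below compares the truncated sum $\sum_m r^{|m|}e^{im(\theta-\theta')}$ with the closed form $(1-r^2)/(1-2r\cos(\theta-\theta')+r^2)$, and then evaluates the boundary integral by a simple equally spaced quadrature. The function names, the sample count and the test data $f(\theta') = \cos 2\theta'$ are illustrative assumptions.

```python
import math

def poisson_kernel(r, delta):
    """Closed form (1 - r^2) / (1 - 2 r cos(delta) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(delta) + r * r)

def kernel_series(r, delta, mmax=200):
    """Truncated sum over m of r^|m| e^{i m delta}; the imaginary parts cancel in pairs."""
    return 1.0 + 2.0 * sum(r ** m * math.cos(m * delta) for m in range(1, mmax + 1))

def solve_disc(f, r, theta, samples=2000):
    """(1/2pi) * integral of kernel * f over the boundary, by an equally spaced rule."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for j in range(samples):
        tp = j * h
        total += poisson_kernel(r, theta - tp) * f(tp)
    return total * h / (2.0 * math.pi)

# Boundary data cos(2 theta') should continue inwards as r^2 cos(2 theta).
inside = solve_disc(lambda t: math.cos(2.0 * t), 0.5, 0.3)
# At r = 0 the formula returns the average of the boundary data.
average = solve_disc(lambda t: 1.0 + math.cos(t), 0.0, 0.0)
print(inside, average)
```

For periodic integrands of this kind the equally spaced rule converges very quickly, so a modest number of samples suffices as long as $r$ is not too close to 1, where the kernel becomes sharply peaked.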

Pie-shaped regions

Figure 6.19: A pie-shaped region of opening angle $\alpha$.



Electrostatics problems involving regions with corners can often be understood by solving Laplace's equation within a pie-shaped region.

Figure 6.19 shows a pie-shaped region of opening angle $\alpha$ and radius $R$. If the boundary value of the potential is zero on the wedge and non-zero on the boundary arc, we can seek solutions as a sum of $r$, $\theta$ separated terms

$$\varphi(r,\theta) = \sum_{n=1}^{\infty} a_n\left(\frac{r}{R}\right)^{n\pi/\alpha}\sin\left(\frac{n\pi\theta}{\alpha}\right). \tag{6.166}$$

Here the trigonometric function is not $2\pi$ periodic, but instead has been constructed so as to make $\varphi$ vanish at $\theta = 0$ and $\theta = \alpha$. These solutions show that close to the edge of a conducting wedge of external opening angle $\alpha$, the surface charge density $\sigma$ usually varies as $\sigma(r) \propto r^{\pi/\alpha - 1}$.

If we have non-zero boundary data on the edge of the wedge at $\theta = \alpha$, but have $\varphi = 0$ on the edge at $\theta = 0$ and on the curved arc $r = R$, then the solutions can be expressed as a continuous sum of $r$, $\theta$ separated terms

$$\begin{aligned}
\varphi(r,\theta) &= \int_0^{\infty} a(\nu)\,\frac{1}{2i}\left[\left(\frac{r}{R}\right)^{i\nu} - \left(\frac{r}{R}\right)^{-i\nu}\right]\frac{\sinh(\nu\theta)}{\sinh(\nu\alpha)}\,d\nu\\
&= \int_0^{\infty} a(\nu)\,\sin[\nu\ln(r/R)]\,\frac{\sinh(\nu\theta)}{\sinh(\nu\alpha)}\,d\nu.
\end{aligned} \tag{6.167}$$

The Mellin sine transformation can be used to compute the coefficient function $a(\nu)$. This transformation lets us write

$$f(r) = \frac{2}{\pi}\int_0^{\infty} F(\nu)\sin(\nu\ln r)\,d\nu, \qquad 0 < r < 1, \tag{6.168}$$

where

$$F(\nu) = \int_0^1 \sin(\nu\ln r)\,f(r)\,\frac{dr}{r}. \tag{6.169}$$

The Mellin sine transformation is a disguised version of the Fourier sine transform of functions on $[0,\infty)$. We simply map the positive $x$ axis onto the interval $(0,1]$ by the change of variables $x = -\ln r$.

Despite its complexity when expressed in terms of these formulae, the simple solution $\varphi(r,\theta) = a\theta$ is often the physically relevant one when the two sides of the wedge are held at different potentials and the potential is allowed to vary on the curved arc.

Example: Consider a pie-shaped region of opening angle $\pi$ and radius $R = \infty$. This region can be considered to be the upper half-plane. Suppose that we are told that the positive $x$ axis is held at potential $+1/2$ and the negative $x$ axis is at potential $-1/2$, and are required to find the potential for positive $y$. If we separate Laplace's equation in cartesian co-ordinates and match to the boundary data on the $x$-axis, we end up with

$$\varphi_{xy}(x,y) = \frac{1}{\pi}\int_0^{\infty}\frac{1}{k}\,e^{-ky}\sin(kx)\,dk.$$

On the other hand, the function

$$\varphi_{r\theta}(r,\theta) = \frac{1}{\pi}(\pi/2 - \theta)$$

satisfies both Laplace's equation and the boundary data. At this point we ought to worry that we do not have enough data to determine the solution uniquely (nothing was said in the statement of the problem about the behavior of $\varphi$ on the boundary arc at infinity) but a little effort shows that

$$\frac{1}{\pi}\int_0^{\infty}\frac{1}{k}\,e^{-ky}\sin(kx)\,dk = \frac{1}{\pi}\tan^{-1}\left(\frac{x}{y}\right) = \frac{1}{\pi}(\pi/2 - \theta), \qquad y > 0, \tag{6.170}$$

and so the two expressions for $\varphi(x,y)$ are equal.
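The agreement of the two expressions can also be confirmed numerically. The sketch below evaluates the cartesian integral $\frac{1}{\pi}\int_0^\infty k^{-1}e^{-ky}\sin(kx)\,dk$ with a composite Simpson rule and compares it with $(\pi/2-\theta)/\pi$. The cutoff of the integral at $k = 40/y$, the step count and the function names are illustrative choices.

```python
import math

def phi_xy(x, y, n_steps=20000):
    """Simpson-rule estimate of (1/pi) * integral_0^inf exp(-k y) sin(k x) / k dk, for y > 0."""
    upper = 40.0 / y              # exp(-k*y) is below 1e-17 beyond this point
    h = upper / n_steps           # n_steps must be even for Simpson's rule

    def integrand(k):
        if k == 0.0:
            return x              # limiting value of sin(kx)/k as k -> 0
        return math.exp(-k * y) * math.sin(k * x) / k

    total = integrand(0.0) + integrand(upper)
    for j in range(1, n_steps):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return total * h / 3.0 / math.pi

def phi_rtheta(x, y):
    """(pi/2 - theta)/pi with theta the polar angle of the point (x, y)."""
    return (math.pi / 2.0 - math.atan2(y, x)) / math.pi

for point in [(1.0, 1.0), (2.0, 0.5), (-1.0, 2.0)]:
    print(point, phi_xy(*point), phi_rtheta(*point))
```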
6.5.3 Eigenfunction Expansions 

Elliptic operators are the natural analogues of the one-dimensional linear 
differential operators we studied in earlier chapters. 

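The operator $L = -\nabla^2$ is formally self-adjoint with respect to the inner product

$$\langle\varphi,\chi\rangle = \iint \varphi^*\chi\,dx\,dy. \tag{6.171}$$

This property follows from Green's identity

$$\iint_\Omega\left\{\varphi^*(-\nabla^2\chi) - (-\nabla^2\varphi)^*\chi\right\}dx\,dy = \int_{\partial\Omega}\left\{\varphi^*(-\nabla\chi) - (-\nabla\varphi)^*\chi\right\}\cdot\mathbf{n}\,ds \tag{6.172}$$

where $\partial\Omega$ is the boundary of the region $\Omega$ and $\mathbf{n}$ is the outward normal on the boundary.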






The method of separation of variables also allows us to solve eigenvalue problems involving the Laplace operator. For example, the Dirichlet eigenvalue problem requires us to find the eigenfunctions and eigenvalues of the operator

$$L = -\nabla^2, \qquad \mathcal{D}(L) = \{\varphi \in L^2[\Omega] : \varphi = 0 \ \text{on}\ \partial\Omega\}. \tag{6.173}$$

Suppose $\Omega$ is the rectangle $0 \le x \le L_x$, $0 \le y \le L_y$. The normalized eigenfunctions are

$$\varphi_{n,m}(x,y) = \sqrt{\frac{4}{L_xL_y}}\,\sin\left(\frac{n\pi x}{L_x}\right)\sin\left(\frac{m\pi y}{L_y}\right), \tag{6.174}$$

with eigenvalues

$$\lambda_{n,m} = \left(\frac{n^2\pi^2}{L_x^2}\right) + \left(\frac{m^2\pi^2}{L_y^2}\right). \tag{6.175}$$

The eigenfunctions are orthonormal,

$$\iint \varphi^*_{n,m}\,\varphi_{n',m'}\,dx\,dy = \delta_{nn'}\delta_{mm'}, \tag{6.176}$$

and complete. Thus, any function in $L^2[\Omega]$ can be expanded as

$$f(x,y) = \sum_{m,n=1}^{\infty} A_{nm}\,\varphi_{n,m}(x,y), \tag{6.177}$$

where

$$A_{nm} = \iint \varphi^*_{n,m}(x,y)\,f(x,y)\,dx\,dy. \tag{6.178}$$

We can find a complete set of eigenfunctions in product form whenever we can separate the Laplace operator in a system of co-ordinates $\xi_i$ such that the boundary becomes $\xi_i = {\rm const.}$ Completeness in the multidimensional space is then guaranteed by the completeness of the eigenfunctions of each one-dimensional differential operator. For other than rectangular co-ordinates, however, the separated eigenfunctions are not elementary functions.

The Laplacian has a complete set of Dirichlet eigenfunctions in any region, but in general these eigenfunctions cannot be written as separated products of one-dimensional functions.
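For the rectangle, the eigenvalue equation and the orthonormality relation can be checked directly. The sketch below applies a five-point finite-difference approximation of $-\nabla^2$ to a product eigenfunction and estimates a few inner products with a midpoint rule. The rectangle dimensions `LX`, `LY`, the step size and the grid size are illustrative assumptions.

```python
import math

LX, LY = 1.0, 2.0   # illustrative rectangle dimensions (not taken from the text)

def phi(n, m, x, y):
    """Normalized product eigenfunction sqrt(4/(Lx Ly)) sin(n pi x/Lx) sin(m pi y/Ly)."""
    return (math.sqrt(4.0 / (LX * LY))
            * math.sin(n * math.pi * x / LX) * math.sin(m * math.pi * y / LY))

def eigenvalue(n, m):
    return (n * math.pi / LX) ** 2 + (m * math.pi / LY) ** 2

def minus_laplacian(n, m, x, y, h=1e-3):
    """Five-point finite-difference estimate of -(d^2/dx^2 + d^2/dy^2) phi."""
    centre = phi(n, m, x, y)
    neighbours = (phi(n, m, x + h, y) + phi(n, m, x - h, y)
                  + phi(n, m, x, y + h) + phi(n, m, x, y - h))
    return (4.0 * centre - neighbours) / (h * h)

def inner(n1, m1, n2, m2, grid=100):
    """Midpoint-rule estimate of the inner product over the rectangle."""
    hx, hy = LX / grid, LY / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * hx
        for j in range(grid):
            y = (j + 0.5) * hy
            total += phi(n1, m1, x, y) * phi(n2, m2, x, y)
    return total * hx * hy

ratio = minus_laplacian(2, 3, 0.3, 0.5) / phi(2, 3, 0.3, 0.5)
print(ratio, eigenvalue(2, 3))
print(inner(1, 1, 1, 1), inner(1, 1, 2, 1))
```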
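6.5.4 Green Functions

Once we know the eigenfunctions $\varphi_n$ and eigenvalues $\lambda_n$ for $-\nabla^2$ in a region $\Omega$, we can write down the Green function as

$$g(\mathbf{r},\mathbf{r}') = \sum_n \frac{1}{\lambda_n}\,\varphi_n(\mathbf{r})\,\varphi^*_n(\mathbf{r}').$$

For example, the Green function for the Laplacian in the entire $\mathbb{R}^n$ is given by the sum over eigenfunctions

$$g(\mathbf{r},\mathbf{r}') = \int\frac{d^nk}{(2\pi)^n}\,\frac{e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}}{k^2}. \tag{6.179}$$

Thus

$$-\nabla^2_{\mathbf{r}}\,g(\mathbf{r},\mathbf{r}') = \int\frac{d^nk}{(2\pi)^n}\,e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')} = \delta^n(\mathbf{r}-\mathbf{r}'). \tag{6.180}$$

We can evaluate the integral for any $n$ by using Schwinger's trick to turn the integrand into a Gaussian:

$$\begin{aligned}
g(\mathbf{r},\mathbf{r}') &= \int_0^{\infty}ds\int\frac{d^nk}{(2\pi)^n}\,e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}\,e^{-sk^2}\\
&= \int_0^{\infty}ds\left(\sqrt{\frac{\pi}{s}}\right)^{n}\frac{1}{(2\pi)^n}\,e^{-\frac{1}{4s}|\mathbf{r}-\mathbf{r}'|^2}\\
&= \frac{1}{2^n\pi^{n/2}}\int_0^{\infty}dt\,t^{\frac{n}{2}-2}\,e^{-t|\mathbf{r}-\mathbf{r}'|^2/4}\\
&= \frac{1}{2^n\pi^{n/2}}\,\Gamma\left(\frac{n}{2}-1\right)\left(\frac{|\mathbf{r}-\mathbf{r}'|^2}{4}\right)^{1-n/2}\\
&= \frac{1}{(n-2)S_{n-1}}\left(\frac{1}{|\mathbf{r}-\mathbf{r}'|}\right)^{n-2}.
\end{aligned} \tag{6.181}$$

Here, $\Gamma(x)$ is Euler's gamma function:

$$\Gamma(x) = \int_0^{\infty}dt\,t^{x-1}e^{-t}, \tag{6.182}$$

and

$$S_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)} \tag{6.183}$$

is the surface area of the $n$-dimensional unit ball.

For three dimensions we find

$$g(\mathbf{r},\mathbf{r}') = \frac{1}{4\pi}\,\frac{1}{|\mathbf{r}-\mathbf{r}'|}, \qquad n = 3. \tag{6.184}$$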
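The last two lines of the Schwinger-trick calculation are related only by gamma-function identities, so they can be compared numerically for several dimensions. The sketch below assumes the forms $\Gamma(n/2-1)\,(r^2/4)^{1-n/2}/(2^n\pi^{n/2})$ and $1/((n-2)S_{n-1}r^{n-2})$ with $S_{n-1} = 2\pi^{n/2}/\Gamma(n/2)$; the function names are illustrative.

```python
import math

def sphere_area(n):
    """S_{n-1} = 2 pi^{n/2} / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def green_gamma_form(n, r):
    """Gamma(n/2 - 1) / (2^n pi^{n/2}) * (r^2 / 4)^{1 - n/2}."""
    return (math.gamma(n / 2.0 - 1.0) / (2.0 ** n * math.pi ** (n / 2.0))
            * (r * r / 4.0) ** (1.0 - n / 2.0))

def green_closed_form(n, r):
    """1 / ((n - 2) S_{n-1} r^{n-2})."""
    return 1.0 / ((n - 2.0) * sphere_area(n) * r ** (n - 2.0))

for n in (3, 4, 5.5, 7):
    print(n, green_gamma_form(n, 2.0), green_closed_form(n, 2.0))

# In three dimensions the result should reduce to 1/(4 pi r).
print(green_closed_form(3, 2.0), 1.0 / (8.0 * math.pi))
```

Non-integer values of `n` are included deliberately: the formula makes sense for continuous $n$, which is what the dimensional-regularization argument for $n = 2$ relies on.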

In two dimensions the Fourier integral is divergent for small $k$. We may control this divergence by using dimensional regularization. We pretend that $n$ is a continuous variable and use

$$\Gamma(x) = \frac{1}{x}\,\Gamma(x+1) \tag{6.185}$$

together with

$$a^x = e^{x\ln a} = 1 + x\ln a + \cdots \tag{6.186}$$

to examine the behaviour of $g(\mathbf{r},\mathbf{r}')$ near $n = 2$:

$$\begin{aligned}
g(\mathbf{r},\mathbf{r}') &= \frac{1}{4\pi}\,\Gamma(n/2-1)\left(1 - (n/2-1)\ln(\pi|\mathbf{r}-\mathbf{r}'|^2) + O\left[(n-2)^2\right]\right)\\
&= \frac{1}{4\pi}\left(\frac{1}{n/2-1} - 2\ln|\mathbf{r}-\mathbf{r}'| - \ln\pi - \gamma + \cdots\right).
\end{aligned} \tag{6.187}$$

Here $\gamma = -\Gamma'(1) = 0.57721\ldots$ is the Euler-Mascheroni constant. Although the pole $1/(n-2)$ blows up at $n = 2$, it is independent of position. We simply absorb it, and the $-\ln\pi - \gamma$, into an undetermined additive constant. Once we have done this, the limit $n \to 2$ can be taken and we find

$$g(\mathbf{r},\mathbf{r}') = -\frac{1}{2\pi}\ln|\mathbf{r}-\mathbf{r}'| + {\rm const.}, \qquad n = 2. \tag{6.188}$$

The constant does not affect the Green-function property, so we can choose any convenient value for it.
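Because the divergent pole is independent of position, it cancels from differences of the Green function at two separations, and this can be seen numerically. The sketch below assumes the $\mathbb{R}^n$ result rewritten as $\frac{1}{4\pi}\Gamma(n/2-1)(\pi r^2)^{-(n/2-1)}$ and takes $n$ slightly above 2; the particular values of `n`, `r1` and `r2` are arbitrary.

```python
import math

def green_n(n, r):
    """(1/4pi) Gamma(n/2 - 1) (pi r^2)^{-(n/2 - 1)}, valid for n > 2."""
    e = n / 2.0 - 1.0
    return math.gamma(e) * (math.pi * r * r) ** (-e) / (4.0 * math.pi)

n = 2.0 + 2.0e-6            # just above two dimensions
r1, r2 = 1.0, 3.0

pole = green_n(n, r1)       # dominated by the 1/(n/2 - 1) pole, of order 1e5
difference = green_n(n, r1) - green_n(n, r2)
expected = -math.log(r1 / r2) / (2.0 * math.pi)
print(pole, difference, expected)
```

Each term separately is enormous, yet the difference already agrees with $-\frac{1}{2\pi}\ln(r_1/r_2)$ to about six figures, the residual being of order $n-2$.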

Although we have managed to sweep the small-$k$ divergence of the Fourier integral under a rug, the hidden infinity still has the capacity to cause problems. The Green function in $\mathbb{R}^3$ allows us to solve for $\varphi(\mathbf{r})$ in the equation

$$-\nabla^2\varphi = q(\mathbf{r}),$$

with the boundary condition $\varphi(\mathbf{r}) \to 0$ as $|\mathbf{r}| \to \infty$, as

$$\varphi(\mathbf{r}) = \int g(\mathbf{r},\mathbf{r}')\,q(\mathbf{r}')\,d^3r'.$$

In two dimensions, however we try to adjust the arbitrary constant in (6.188), the divergence of the logarithm at infinity means that there can be no solution to the corresponding boundary-value problem unless $\int q(\mathbf{r})\,d^2r = 0$. This is not a Fredholm-alternative constraint because once the constraint is satisfied the solution is unique. The two-dimensional problem is therefore pathological from the viewpoint of Fredholm theory. This pathology is of the same character as the non-existence of solutions to the three-dimensional Dirichlet boundary-value problem with boundary spikes. The Fredholm alternative applies, in general, only to operators with a discrete spectrum.



Exercise 6.8: Evaluate our formula for the $\mathbb{R}^n$ Laplace Green function,

$$g(\mathbf{r},\mathbf{r}') = \frac{1}{(n-2)S_{n-1}|\mathbf{r}-\mathbf{r}'|^{n-2}}$$

with $S_{n-1} = 2\pi^{n/2}/\Gamma(n/2)$, for the case $n = 1$. Show that the resulting expression for $g(x,x')$ is not divergent, and obeys

$$-\frac{d^2}{dx^2}\,g(x,x') = \delta(x-x').$$

Our formula therefore makes sense as a Green function, even though the original integral (6.179) is linearly divergent at $k = 0$! We must defer an explanation of this miracle until we discuss analytic continuation in the context of complex analysis.
(Hint: recall that $\Gamma(1/2) = \sqrt{\pi}$.)



6.5.5 Boundary-value problems 

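We now look at how the Green function can be used to solve the interior Dirichlet boundary-value problem in regions where the method of separation of variables is not available. Figure 6.20 shows a bounded region $\Omega$ possessing a smooth boundary $\partial\Omega$.

Figure 6.20: Interior Dirichlet problem.

We wish to solve $-\nabla^2\varphi = q(\mathbf{r})$ for $\mathbf{r}\in\Omega$ and with $\varphi(\mathbf{r}) = f(\mathbf{r})$ for $\mathbf{r}\in\partial\Omega$. Suppose we have found a Green function that obeys

$$-\nabla^2_{\mathbf{r}}\,g(\mathbf{r},\mathbf{r}') = \delta^n(\mathbf{r}-\mathbf{r}'), \quad \mathbf{r},\mathbf{r}'\in\Omega, \qquad g(\mathbf{r},\mathbf{r}') = 0, \quad \mathbf{r}\in\partial\Omega. \tag{6.189}$$

We first show that $g(\mathbf{r},\mathbf{r}') = g(\mathbf{r}',\mathbf{r})$ by the same methods we used for one-dimensional self-adjoint operators. Next we follow the strategy that we used for one-dimensional inhomogeneous differential equations: we use Lagrange's identity (in this context called Green's theorem) to write

$$\int_\Omega d^nr\left\{g(\mathbf{r},\mathbf{r}')\nabla^2_{\mathbf{r}}\varphi(\mathbf{r}) - \varphi(\mathbf{r})\nabla^2_{\mathbf{r}}g(\mathbf{r},\mathbf{r}')\right\} = \int_{\partial\Omega} d\mathbf{S}_{\mathbf{r}}\cdot\left\{g(\mathbf{r},\mathbf{r}')\nabla_{\mathbf{r}}\varphi(\mathbf{r}) - \varphi(\mathbf{r})\nabla_{\mathbf{r}}g(\mathbf{r},\mathbf{r}')\right\}, \tag{6.190}$$

where $d\mathbf{S}_{\mathbf{r}} = \mathbf{n}\,dS_{\mathbf{r}}$, with $\mathbf{n}$ the outward normal to $\partial\Omega$ at the point $\mathbf{r}$. The left hand side is

$$\begin{aligned}
\text{L.H.S.} &= \int_\Omega d^nr\left\{-g(\mathbf{r},\mathbf{r}')q(\mathbf{r}) + \varphi(\mathbf{r})\delta^n(\mathbf{r}-\mathbf{r}')\right\}\\
&= -\int_\Omega d^nr\,g(\mathbf{r},\mathbf{r}')q(\mathbf{r}) + \varphi(\mathbf{r}')\\
&= -\int_\Omega d^nr\,g(\mathbf{r}',\mathbf{r})q(\mathbf{r}) + \varphi(\mathbf{r}').
\end{aligned} \tag{6.191}$$

On the right hand side, the boundary condition on $g(\mathbf{r},\mathbf{r}')$ makes the first term zero, so

$$\text{R.H.S.} = -\int_{\partial\Omega} dS_{\mathbf{r}}\,f(\mathbf{r})\,(\mathbf{n}\cdot\nabla_{\mathbf{r}})g(\mathbf{r},\mathbf{r}'). \tag{6.192}$$

Therefore,

$$\varphi(\mathbf{r}') = \int_\Omega g(\mathbf{r}',\mathbf{r})\,q(\mathbf{r})\,d^nr - \int_{\partial\Omega} f(\mathbf{r})\,(\mathbf{n}\cdot\nabla_{\mathbf{r}})g(\mathbf{r},\mathbf{r}')\,dS_{\mathbf{r}}. \tag{6.193}$$

In the language of chapter 3, the first term is a particular integral and the second (the boundary integral term) is the complementary function.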

Exercise 6.9: Assume that the boundary is a smooth surface. Show that the limit of $\varphi(\mathbf{r}')$ as $\mathbf{r}'$ approaches the boundary is indeed consistent with the boundary data $f(\mathbf{r}')$. (Hint: When $\mathbf{r}$, $\mathbf{r}'$ are very close to it, the boundary can be approximated by a straight line segment, and so $g(\mathbf{r},\mathbf{r}')$ can be found by the method of images.)



Figure 6.21: Exterior Dirichlet problem.

A similar method works for the exterior Dirichlet problem shown in figure 6.21. In this case we seek a Green function obeying

$$-\nabla^2_{\mathbf{r}}\,g(\mathbf{r},\mathbf{r}') = \delta^n(\mathbf{r}-\mathbf{r}'), \quad \mathbf{r},\mathbf{r}'\in\mathbb{R}^n\setminus\Omega, \qquad g(\mathbf{r},\mathbf{r}') = 0, \quad \mathbf{r}\in\partial\Omega. \tag{6.194}$$

(The notation $\mathbb{R}^n\setminus\Omega$ means the region outside $\Omega$.) We also impose a further boundary condition by requiring $g(\mathbf{r},\mathbf{r}')$, and hence $\varphi(\mathbf{r})$, to tend to zero as $|\mathbf{r}| \to \infty$. The final formula for $\varphi(\mathbf{r})$ is the same except for the region of integration and the sign of the boundary term.

The hard part of both the interior and exterior problems is to find the Green function for the given domain.
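Exercise 6.10: Suppose that $\varphi(x,y)$ is harmonic in the half-plane $y > 0$, tends to zero as $y \to \infty$, and takes the values $f(x)$ on the boundary $y = 0$. Show that

$$\varphi(x,y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{y}{(x-x')^2+y^2}\,f(x')\,dx', \qquad y > 0.$$

Deduce that the "energy" functional

$$S[f] = \frac{1}{2}\int_{y>0}|\nabla\varphi|^2\,dx\,dy = -\frac{1}{2}\int_{-\infty}^{\infty} f(x)\left.\frac{\partial\varphi}{\partial y}\right|_{y=0}dx$$

can be expressed as

$$S[f] = \frac{1}{4\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left\{\frac{f(x')-f(x)}{x'-x}\right\}^2 dx'\,dx.$$

The non-local functional $S[f]$ appears in the quantum version of the Caldeira-Leggett model. See also exercise 2.24.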



Method of Images 

When $\partial\Omega$ is a sphere or a circle we can find the Dirichlet Green functions for the region $\Omega$ by using the method of images.

Figure 6.22: Points inverse with respect to a circle.

Figure 6.22 shows a circle of radius $R$. Given a point $B$ outside the circle, and a point $X$ on the circle, we construct $A$ inside and on the line $OB$, so that $\angle OBX = \angle OXA$. We now observe that $\triangle XOA$ is similar to $\triangle BOX$, and so

$$\frac{OA}{OX} = \frac{OX}{OB}. \tag{6.195}$$

Thus, $OA \times OB = (OX)^2 = R^2$. The points $A$ and $B$ are therefore mutually inverse with respect to the circle. In particular, the point $A$ does not depend on which point $X$ was chosen.

Now let $AX = r_1$, $BX = r_2$ and $OB = B$. Then, using similar triangles again, we have

$$\frac{AX}{OX} = \frac{BX}{OB}, \tag{6.196}$$

or

$$\frac{r_1}{R} = \frac{r_2}{B}, \tag{6.197}$$

and so

$$\frac{1}{r_1}\left(\frac{R}{B}\right) - \frac{1}{r_2} = 0. \tag{6.198}$$

Interpreting the figure as a slice through the centre of a sphere of radius $R$, we see that if we put a unit charge at $B$, then the insertion of an image charge of magnitude $q = -R/B$ at $A$ serves to keep the entire surface of the sphere at zero potential.

Thus, in three dimensions, and with $\Omega$ the region exterior to the sphere, the Dirichlet Green function is

$$g_\Omega(\mathbf{r},\mathbf{r}_B) = \frac{1}{4\pi}\left(\frac{1}{|\mathbf{r}-\mathbf{r}_B|} - \left(\frac{R}{|\mathbf{r}_B|}\right)\frac{1}{|\mathbf{r}-\mathbf{r}_A|}\right). \tag{6.199}$$

In two dimensions, we find similarly that

$$g_\Omega(\mathbf{r},\mathbf{r}_B) = -\frac{1}{2\pi}\Big(\ln|\mathbf{r}-\mathbf{r}_B| - \ln|\mathbf{r}-\mathbf{r}_A| - \ln\left(|\mathbf{r}_B|/R\right)\Big), \tag{6.200}$$

has $g_\Omega(\mathbf{r},\mathbf{r}_B) = 0$ for $\mathbf{r}$ on the circle. Thus, this is the Dirichlet Green function for $\Omega$, the region exterior to the circle.

We can use the same method to construct the interior Green functions for the sphere and circle.
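The image construction can be verified by checking that the exterior Green functions vanish on the sphere and on the circle. The sketch below places the image at the inverse point $\mathbf{r}_A = R^2\mathbf{r}_B/|\mathbf{r}_B|^2$ with strength $-R/|\mathbf{r}_B|$ in three dimensions, and includes the constant $\ln(|\mathbf{r}_B|/R)$ in two. The helper names, the radius and the source positions are illustrative choices.

```python
import math

def norm(p):
    return math.sqrt(sum(c * c for c in p))

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def inverse_point(rb, R):
    """Point inverse to rb with respect to the sphere (or circle) of radius R."""
    s = R * R / sum(c * c for c in rb)
    return tuple(s * c for c in rb)

def green_exterior_3d(r, rb, R):
    """Unit source at rb plus image of strength -R/|rb| at the inverse point."""
    ra = inverse_point(rb, R)
    return (1.0 / dist(r, rb) - (R / norm(rb)) / dist(r, ra)) / (4.0 * math.pi)

def green_exterior_2d(r, rb, R):
    ra = inverse_point(rb, R)
    return -(math.log(dist(r, rb)) - math.log(dist(r, ra))
             - math.log(norm(rb) / R)) / (2.0 * math.pi)

R = 2.0
rb3 = (1.0, -3.0, 2.5)      # source outside the sphere
rb2 = (3.0, 1.0)            # source outside the circle

worst_3d = max(abs(green_exterior_3d(
    (R * math.sin(a) * math.cos(b), R * math.sin(a) * math.sin(b), R * math.cos(a)),
    rb3, R)) for a in (0.3, 1.1, 2.0, 2.9) for b in (0.0, 1.7, 4.0))
worst_2d = max(abs(green_exterior_2d((R * math.cos(t), R * math.sin(t)), rb2, R))
               for t in (0.0, 0.9, 2.2, 3.5, 5.0))
print(worst_3d, worst_2d)
```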



6.5.6 Kirchhoff vs. Huygens 

Even if we do not have a Green function tailored for the specific region in which we are interested, we can still use the whole-space Green function to convert the differential equation into an integral equation, and so make progress. An example of this technique is provided by Kirchhoff's partial justification of Huygens' construction.

The Green function $G(\mathbf{r},\mathbf{r}')$ for the elliptic Helmholtz equation

$$(-\nabla^2 + \kappa^2)\,G(\mathbf{r},\mathbf{r}') = \delta^3(\mathbf{r}-\mathbf{r}') \tag{6.201}$$

in $\mathbb{R}^3$ is given by

$$\int\frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}}{k^2+\kappa^2} = \frac{1}{4\pi|\mathbf{r}-\mathbf{r}'|}\,e^{-\kappa|\mathbf{r}-\mathbf{r}'|}. \tag{6.202}$$

Exercise 6.11: Perform the $k$ integration and confirm this.

For solutions of the wave equation with $e^{-i\omega t}$ time dependence, we want a Green function such that

$$\left[-\nabla^2 - \left(\frac{\omega^2}{c^2}\right)\right]G(\mathbf{r},\mathbf{r}') = \delta^3(\mathbf{r}-\mathbf{r}'), \tag{6.203}$$

and so we have to take $\kappa^2$ negative. We therefore have two possible Green functions

$$G_\pm(\mathbf{r},\mathbf{r}') = \frac{1}{4\pi|\mathbf{r}-\mathbf{r}'|}\,e^{\pm ik|\mathbf{r}-\mathbf{r}'|}, \tag{6.204}$$

where $k = |\omega|/c$. These correspond to taking the real part of $\kappa^2$ negative, but giving it an infinitesimal imaginary part, as we did when discussing resolvent operators in chapter 5. If we want outgoing waves, we must take $G = G_+$.

Now suppose we want to solve

$$(\nabla^2 + k^2)\psi = 0 \tag{6.205}$$

in an arbitrary region $\Omega$. As before, we use Green's theorem to write

$$\int_\Omega\left\{G(\mathbf{r},\mathbf{r}')(\nabla^2_{\mathbf{r}}+k^2)\psi(\mathbf{r}) - \psi(\mathbf{r})(\nabla^2_{\mathbf{r}}+k^2)G(\mathbf{r},\mathbf{r}')\right\}d^nx = \int_{\partial\Omega}\left\{G(\mathbf{r},\mathbf{r}')\nabla_{\mathbf{r}}\psi(\mathbf{r}) - \psi(\mathbf{r})\nabla_{\mathbf{r}}G(\mathbf{r},\mathbf{r}')\right\}\cdot d\mathbf{S}_{\mathbf{r}} \tag{6.206}$$

where $d\mathbf{S}_{\mathbf{r}} = \mathbf{n}\,dS_{\mathbf{r}}$, with $\mathbf{n}$ the outward normal to $\partial\Omega$ at the point $\mathbf{r}$. The left hand side is

$$\int_\Omega\psi(\mathbf{r})\,\delta^n(\mathbf{r}-\mathbf{r}')\,d^nx = \begin{cases}\psi(\mathbf{r}'), & \mathbf{r}'\in\Omega,\\ 0, & \mathbf{r}'\notin\Omega,\end{cases} \tag{6.207}$$

and so

$$\psi(\mathbf{r}') = \int_{\partial\Omega}\left\{G(\mathbf{r},\mathbf{r}')(\mathbf{n}\cdot\nabla_{\mathbf{r}})\psi(\mathbf{r}) - \psi(\mathbf{r})(\mathbf{n}\cdot\nabla_{\mathbf{r}})G(\mathbf{r},\mathbf{r}')\right\}dS_{\mathbf{r}}, \qquad \mathbf{r}'\in\Omega. \tag{6.208}$$

This must not be thought of as a solution to the wave equation in terms of an integral over the boundary, analogous to the solution (6.193) of the Dirichlet problem that we found in the last section. Here, unlike that earlier case, $G(\mathbf{r},\mathbf{r}')$ knows nothing of the boundary $\partial\Omega$, and so both terms in the surface integral contribute to $\psi$. We therefore have a formula for $\psi(\mathbf{r})$ in the interior in terms of both Dirichlet and Neumann data on the boundary $\partial\Omega$, and giving both over-prescribes the problem. If we take arbitrary values for $\psi$ and $(\mathbf{n}\cdot\nabla)\psi$ on the boundary, and plug them into (6.208) so as to compute $\psi(\mathbf{r})$ within $\Omega$ then there is no reason for the resulting $\psi(\mathbf{r})$ to reproduce, as $\mathbf{r}$ approaches the boundary, the values $\psi$ and $(\mathbf{n}\cdot\nabla)\psi$ appearing in the integral. If we demand that the output $\psi(\mathbf{r})$ does reproduce the input boundary data, then this is equivalent to demanding that the boundary data come from a solution of the differential equation in a region encompassing $\Omega$.




Figure 6.23: Huygens' construction.

The mathematical inconsistency of assuming arbitrary boundary data notwithstanding, this is exactly what we do when we follow Kirchhoff and use (6.208) to provide a justification of Huygens' construction as used in optics. Consider the problem of a plane wave, $\psi = e^{ikx}$, incident on a screen from the left and passing through the aperture labelled $AB$ in figure 6.23.

We take as the region $\Omega$ everything to the right of the obstacle. The Kirchhoff approximation consists of assuming that the values of $\psi$ and $(\mathbf{n}\cdot\nabla)\psi$ on the surface $AB$ are $e^{ikx}$ and $-ike^{ikx}$, the same as they would be if the obstacle were not there, and that they are identically zero on all other parts of the boundary. In other words, we completely ignore any scattering by the material in which the aperture resides. We can then use our formula to estimate $\psi$ in the region to the right of the aperture. If we further set



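$$(\mathbf{n}\cdot\nabla_{\mathbf{r}})G(\mathbf{r},\mathbf{r}') \approx ik\cos\theta\,\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|}, \tag{6.209}$$

which is a good approximation provided we are more than a few wavelengths away from the aperture, we find

$$\psi(\mathbf{r}') \approx \frac{k}{4\pi i}\int_{\rm aperture}(1+\cos\theta)\,\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,e^{ikx}\,dS_{\mathbf{r}}. \tag{6.210}$$

Here $\theta$ is the angle between the direction of incidence (the $x$ axis) and the line from the aperture point $\mathbf{r}$ to the observation point $\mathbf{r}'$.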



Thus, each part of the wavefront on the surface $AB$ acts as a source for the diffracted wave in $\Omega$.

This result, although still an approximation, provides two substantial improvements to the naive form of Huygens' construction as presented in elementary courses:

i) There is a factor of $(1+\cos\theta)$ which suppresses backward propagating waves. The traditional exposition of Huygens' construction takes no notice of which way the wave is going, and so provides no explanation as to why a wavefront does not act as a source for a backward wave.

ii) There is a factor of $i^{-1} = e^{-i\pi/2}$ which corrects a 90° error in the phase made by the naive Huygens construction. For two-dimensional slit geometry we must use the more complicated two-dimensional Green function (it is a Bessel function), and this provides an $e^{-i\pi/4}$ factor which corrects for the 45° phase error that is manifest in the Cornu spiral of Fresnel diffraction.

For this reason the Kirchhoff approximation is widely used.

Problem 6.12: Use the method of images to construct i) the Dirichlet, and ii) the Neumann, Green function for the region $\Omega$, consisting of everything to the right of the screen. Use your Green functions to write the solution to the diffraction problem in this region a) in terms of the values of $\psi$ on the aperture surface $AB$, b) in terms of the values of $(\mathbf{n}\cdot\nabla)\psi$ on the aperture surface. In each case, assume that the boundary data are identically zero on the dark side of the screen. Your expressions should coincide with the Rayleigh-Sommerfeld diffraction integrals of the first and second kind, respectively (see footnote 3). Explore the differences between the predictions of these two formulae and that of Kirchhoff for the case of the diffraction of a plane wave incident on the aperture from the left.



6.6 Further Exercises and problems 

Problem 6.13: Critical Mass. An infinite slab of fissile material has thickness $L$. The neutron density $n(x,t)$ in the material obeys the equation

$$\frac{\partial n}{\partial t} = D\frac{\partial^2 n}{\partial x^2} + \lambda n + \mu,$$

where $n(x,t)$ is zero at the surface of the slab at $x = 0, L$. Here, $D$ is the neutron diffusion constant, the term $\lambda n$ describes the creation of new neutrons by induced fission, and the constant $\mu$ is the rate of production per unit volume of neutrons by spontaneous fission.

a) Expand $n(x,t)$ as a series,

$$n(x,t) = \sum_m a_m(t)\,\varphi_m(x),$$

where the $\varphi_m(x)$ are a complete set of functions you think suitable for solving the problem.

b) Find an explicit expression for the coefficients $a_m(t)$ in terms of their initial values $a_m(0)$.

c) Determine the critical thickness $L_{\rm crit}$ above which the slab will explode.

d) Assuming that $L < L_{\rm crit}$, find the equilibrium distribution $n_{\rm eq}(x)$ of neutrons in the slab. (You may either sum your series expansion to get an explicit closed-form answer, or use another (Green function?) method.)

Problem 6.14: Semi-infinite Rod. Consider the heat equation

$$\frac{\partial\theta}{\partial t} = D\nabla^2\theta, \qquad 0 < x < \infty,$$

with the temperature $\theta(x,t)$ obeying the initial condition $\theta(x,0) = \theta_0$ for $0 < x < \infty$, and the boundary condition $\theta(0,t) = 0$.



3 M. Born and E. Wolf Principles of Optics 7th (expanded) edition, section 8.11. 
a) Show that the boundary condition at $x = 0$ may be satisfied at all times by introducing a suitable mirror image of the initial data in the region $-\infty < x < 0$, and then applying the heat kernel for the entire real line to this extended initial data. Show that the resulting solution of the semi-infinite rod problem can be expressed in terms of the error function

$${\rm erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-\xi^2}\,d\xi,$$

as

$$\theta(x,t) = \theta_0\,{\rm erf}\left(\frac{x}{\sqrt{4Dt}}\right).$$

b) Solve the same problem by using a Fourier integral expansion in terms of $\sin kx$ on the half-line $0 < x < \infty$ and obtaining the time evolution of the Fourier coefficients. Invert the transform and show that your answer reduces to that of part a). (Hint: replace the initial condition by $\theta(x,0) = \theta_0e^{-\epsilon x}$, so that the Fourier transform converges, and then take the limit $\epsilon \to 0$ at the end of your calculation.)

Exercise 6.15: Seasonal Heat Waves. Suppose that the measured temperature of the air above the arctic permafrost at time $t$ is expressed as a Fourier series

$$\theta(t) = \theta_0 + \sum_{n=1}^{\infty}\theta_n\cos n\omega t,$$

where the period $T = 2\pi/\omega$ is one year. Solve the heat equation for the soil temperature,

$$\frac{\partial\theta}{\partial t} = \kappa\frac{\partial^2\theta}{\partial z^2},$$

with this boundary condition, and find the temperature $\theta(z,t)$ at a depth $z$ below the surface as a function of time. Observe that the sub-surface temperature fluctuates with the same period as that of the air, but with a phase lag that depends on the depth. Also observe that the longest-period temperature fluctuations penetrate the deepest into the ground. (Hint: for each Fourier component, write $\theta$ as ${\rm Re}[A_n(z)\exp(in\omega t)]$, where $A_n$ is a complex function of $z$.)

The next problem is an illustration of a Dirichlet principle. 

Exercise 6.16: Helmholtz-Hodge decomposition. Given a three-dimensional region $\Omega$ with smooth boundary $\partial\Omega$, introduce the real Hilbert space $L^2_{\rm vec}(\Omega)$ of finite-norm vector fields, with inner product

$$\langle\mathbf{u},\mathbf{v}\rangle = \int_\Omega\mathbf{u}\cdot\mathbf{v}\,d^3x.$$

Consider the spaces $\mathcal{L} = \{\mathbf{v} : \mathbf{v} = \nabla\phi\}$ and $\mathcal{T} = \{\mathbf{v} : \mathbf{v} = {\rm curl}\,\mathbf{w}\}$ consisting of vector fields in $L^2_{\rm vec}(\Omega)$ that can be written as gradients and curls, respectively. (Strictly speaking, we should consider the completions of these spaces.)

a) Show that if we demand that either (or both) of $\phi$ and the tangential component of $\mathbf{w}$ vanish on $\partial\Omega$, then the two spaces $\mathcal{L}$ and $\mathcal{T}$ are mutually orthogonal with respect to the $L^2_{\rm vec}(\Omega)$ inner product.

Let $\mathbf{u}\in L^2_{\rm vec}(\Omega)$. We will try to express $\mathbf{u}$ as the sum of a gradient and a curl by seeking to make the distance functional

$$F_{\mathbf{u}}[\phi,\mathbf{w}] = \|\mathbf{u} - \nabla\phi - {\rm curl}\,\mathbf{w}\|^2 \equiv \int_\Omega|\mathbf{u} - \nabla\phi - {\rm curl}\,\mathbf{w}|^2\,d^3x$$

equal to zero.

b) Show that if we find a $\mathbf{w}$ and $\phi$ that minimize $F_{\mathbf{u}}[\phi,\mathbf{w}]$, then the residual vector field

$$\mathbf{h} = \mathbf{u} - \nabla\phi - {\rm curl}\,\mathbf{w}$$

obeys ${\rm curl}\,\mathbf{h} = 0$ and ${\rm div}\,\mathbf{h} = 0$, together with boundary conditions determined by the constraints imposed on $\phi$ and $\mathbf{w}$:

i) If $\phi$ is unconstrained on $\partial\Omega$, but the tangential boundary component of $\mathbf{w}$ is required to vanish, then the component of $\mathbf{h}$ normal to the boundary must be zero.

ii) If $\phi = 0$ on $\partial\Omega$, but the tangential boundary component of $\mathbf{w}$ is unconstrained, then the tangential boundary component of $\mathbf{h}$ must be zero.

iii) If $\phi = 0$ on $\partial\Omega$ and also the tangential boundary component of $\mathbf{w}$ is required to vanish, then $\mathbf{h}$ need satisfy no boundary condition.

c) Assuming that we can find suitable minimizing $\phi$ and $\mathbf{w}$, deduce that under each of the three boundary conditions of the previous part, we have a Helmholtz-Hodge decomposition

$$\mathbf{u} = \nabla\phi + {\rm curl}\,\mathbf{w} + \mathbf{h}$$

into unique parts that are mutually $L^2_{\rm vec}(\Omega)$ orthogonal. Observe that the residual vector field $\mathbf{h}$ is harmonic, i.e. it satisfies the equation $\nabla^2\mathbf{h} = 0$, where

$$\nabla^2\mathbf{h} = \nabla({\rm div}\,\mathbf{h}) - {\rm curl}\,({\rm curl}\,\mathbf{h})$$

is the vector Laplacian acting on $\mathbf{h}$.
If $\mathbf{u}$ is sufficiently smooth, there will exist $\phi$ and $\mathbf{w}$ that minimize the distance $\|\mathbf{u} - \nabla\phi - {\rm curl}\,\mathbf{w}\|$ and satisfy the boundary conditions. Whether or not $\mathbf{h}$ is needed in the decomposition is another matter. It depends both on how we constrain $\phi$ and $\mathbf{w}$, and on the topology of $\Omega$. At issue is whether or not the boundary conditions imposed on $\mathbf{h}$ are sufficient to force it to be zero. If $\Omega$ is the interior of a torus, for example, then $\mathbf{h}$ can be non-zero whenever its tangential component is unconstrained.

The Helmholtz-Hodge decomposition is closely related to the vector-field eigenvalue problems commonly met with in electromagnetism or elasticity. The next few exercises lead up to this connection.

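Exercise 6.17: Self-adjointness and the vector Laplacian. Consider the vector
Laplacian (defined in the previous problem) as a linear operator on the Hilbert
space $L^2_{\rm vec}(\Omega)$.

a) Show that

$$\int_\Omega d^3x \,\{\mathbf{u}\cdot(\nabla^2\mathbf{v}) - \mathbf{v}\cdot(\nabla^2\mathbf{u})\} = \int_{\partial\Omega} \{(\mathbf{n}\cdot\mathbf{u})\,{\rm div}\,\mathbf{v} - (\mathbf{n}\cdot\mathbf{v})\,{\rm div}\,\mathbf{u} - \mathbf{u}\cdot(\mathbf{n}\times{\rm curl}\,\mathbf{v}) + \mathbf{v}\cdot(\mathbf{n}\times{\rm curl}\,\mathbf{u})\}\, dS.$$

b) Deduce from the identity in part a) that the domain of $\nabla^2$ coincides
with the domain of $(\nabla^2)^\dagger$, and hence the vector Laplacian defines a truly
self-adjoint operator with a complete set of mutually orthogonal eigenfunctions,
when we take as boundary conditions one of the following:

0) Dirichlet-Dirichlet: $\mathbf{n}\cdot\mathbf{u} = 0$ and $\mathbf{n}\times\mathbf{u} = 0$ on $\partial\Omega$,

i) Dirichlet-Neumann: $\mathbf{n}\cdot\mathbf{u} = 0$ and $\mathbf{n}\times{\rm curl}\,\mathbf{u} = 0$ on $\partial\Omega$,

ii) Neumann-Dirichlet: ${\rm div}\,\mathbf{u} = 0$ and $\mathbf{n}\times\mathbf{u} = 0$ on $\partial\Omega$,

iii) Neumann-Neumann: ${\rm div}\,\mathbf{u} = 0$ and $\mathbf{n}\times{\rm curl}\,\mathbf{u} = 0$ on $\partial\Omega$.

c) Show that the more general Robin boundary conditions

$$\alpha(\mathbf{n}\cdot\mathbf{u}) + \beta\,{\rm div}\,\mathbf{u} = 0,$$
$$\mu(\mathbf{n}\times\mathbf{u}) + \nu(\mathbf{n}\times{\rm curl}\,\mathbf{u}) = 0,$$

where $\alpha$, $\beta$, $\mu$, $\nu$ can be position dependent, also give rise to a truly
self-adjoint operator.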

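Problem 6.18: Cavity electrodynamics and the Hodge-Weyl decomposition.
Each of the self-adjoint boundary conditions in the previous problem gives
rise to a complete set of mutually orthogonal vector eigenfunctions obeying

$$-\nabla^2 \mathbf{u}_n = k_n^2 \mathbf{u}_n.$$

For these eigenfunctions to describe the normal modes of the electric field $\mathbf{E}$
and the magnetic field $\mathbf{B}$ (which we identify with $\mathbf{H}$ as we will use units in
which $\mu_0 = \epsilon_0 = 1$) within a cavity bounded by a perfect conductor, we need
to additionally impose the Maxwell equations ${\rm div}\,\mathbf{B} = {\rm div}\,\mathbf{E} = 0$ everywhere
within $\Omega$ and to satisfy the perfect-conductor boundary conditions
$\mathbf{n}\times\mathbf{E} = \mathbf{n}\cdot\mathbf{B} = 0$.

a) For each eigenfunction $\mathbf{u}_n$ corresponding to a non-zero eigenvalue $k_n^2$,
define

$$\mathbf{v}_n = \frac{1}{k_n^2}\,{\rm curl}\,({\rm curl}\,\mathbf{u}_n), \qquad \mathbf{w}_n = -\frac{1}{k_n^2}\,\nabla({\rm div}\,\mathbf{u}_n),$$

so that $\mathbf{u}_n = \mathbf{v}_n + \mathbf{w}_n$. Show that $\mathbf{v}_n$ and $\mathbf{w}_n$ are, if non-zero, each
eigenfunctions of $-\nabla^2$ with eigenvalue $k_n^2$. The vector eigenfunctions
that are not in the null-space of $\nabla^2$ can therefore be decomposed into
their transverse (the $\mathbf{v}_n$, which obey ${\rm div}\,\mathbf{v}_n = 0$) and longitudinal (the
$\mathbf{w}_n$, which obey ${\rm curl}\,\mathbf{w}_n = 0$) parts. However, it is not immediately clear
what boundary conditions the $\mathbf{v}_n$ and $\mathbf{w}_n$ separately obey.

b) The boundary-value problems of relevance to electromagnetism are:

i) $-\nabla^2 \mathbf{h}_n = k_n^2 \mathbf{h}_n$ within $\Omega$, with $\mathbf{n}\cdot\mathbf{h}_n = 0$, $\mathbf{n}\times{\rm curl}\,\mathbf{h}_n = 0$ on $\partial\Omega$;

ii) $-\nabla^2 \mathbf{e}_n = k_n^2 \mathbf{e}_n$ within $\Omega$, with ${\rm div}\,\mathbf{e}_n = 0$, $\mathbf{n}\times\mathbf{e}_n = 0$ on $\partial\Omega$;

iii) $-\nabla^2 \mathbf{b}_n = k_n^2 \mathbf{b}_n$ within $\Omega$, with ${\rm div}\,\mathbf{b}_n = 0$, $\mathbf{n}\times{\rm curl}\,\mathbf{b}_n = 0$ on $\partial\Omega$.

These problems involve, respectively, the Dirichlet-Neumann, Neumann-Dirichlet,
and Neumann-Neumann boundary conditions from the previous problem.

Show that the divergence-free transverse eigenfunctions

$$\mathbf{H}_n \stackrel{\rm def}{=} \frac{1}{k_n^2}\,{\rm curl}\,({\rm curl}\,\mathbf{h}_n), \qquad \mathbf{E}_n \stackrel{\rm def}{=} \frac{1}{k_n^2}\,{\rm curl}\,({\rm curl}\,\mathbf{e}_n), \qquad \mathbf{B}_n \stackrel{\rm def}{=} \frac{1}{k_n^2}\,{\rm curl}\,({\rm curl}\,\mathbf{b}_n),$$

obey $\mathbf{n}\cdot\mathbf{H}_n = \mathbf{n}\times\mathbf{E}_n = \mathbf{n}\times{\rm curl}\,\mathbf{B}_n = 0$ on the boundary, and that from
these and the eigenvalue equations we can deduce that
$\mathbf{n}\times{\rm curl}\,\mathbf{H}_n = \mathbf{n}\cdot\mathbf{B}_n = \mathbf{n}\cdot{\rm curl}\,\mathbf{E}_n = 0$ on the boundary. The perfect-conductor
boundary conditions are therefore satisfied.

Also show that the corresponding longitudinal eigenfunctions

$$\boldsymbol{\eta}_n \stackrel{\rm def}{=} -\frac{1}{k_n^2}\,\nabla({\rm div}\,\mathbf{h}_n), \qquad \boldsymbol{\epsilon}_n \stackrel{\rm def}{=} -\frac{1}{k_n^2}\,\nabla({\rm div}\,\mathbf{e}_n), \qquad \boldsymbol{\beta}_n \stackrel{\rm def}{=} -\frac{1}{k_n^2}\,\nabla({\rm div}\,\mathbf{b}_n)$$

obey the boundary conditions $\mathbf{n}\cdot\boldsymbol{\eta}_n = \mathbf{n}\times\boldsymbol{\epsilon}_n = \mathbf{n}\times\boldsymbol{\beta}_n = 0$.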
c) By considering the counter-example provided by a rectangular box, show
that the Dirichlet-Dirichlet boundary condition is not compatible with a
longitudinal+transverse decomposition. (A purely transverse wave incident
on such a boundary will, on reflection, acquire a longitudinal component.)

d) Show that

$$0 = \int_\Omega \boldsymbol{\eta}_m\cdot\mathbf{H}_n\, d^3x = \int_\Omega \boldsymbol{\epsilon}_m\cdot\mathbf{E}_n\, d^3x = \int_\Omega \boldsymbol{\beta}_m\cdot\mathbf{B}_n\, d^3x,$$

but that the $\mathbf{v}_n$ and $\mathbf{w}_n$ obtained from the Dirichlet-Dirichlet boundary
condition $\mathbf{u}_n$'s are not in general orthogonal to each other. Use the
continuity of the $L^2_{\rm vec}(\Omega)$ inner product

$$\mathbf{x}_n \to \mathbf{x} \;\Rightarrow\; \langle \mathbf{x}_n, \mathbf{y}\rangle \to \langle \mathbf{x}, \mathbf{y}\rangle$$

to show that this individual-eigenfunction orthogonality is retained by
limits of sums of the eigenfunctions. Deduce that, for each of the boundary
conditions i)-iii) (but not for the Dirichlet-Dirichlet case), we have
the Hodge-Weyl decomposition of $L^2_{\rm vec}(\Omega)$ as the orthogonal direct sum

$$L^2_{\rm vec}(\Omega) = \mathcal{L} \oplus \mathcal{T} \oplus \mathcal{N},$$

where $\mathcal{L}$, $\mathcal{T}$ are respectively the spaces of functions representable as
infinite sums of the longitudinal and transverse eigenfunctions, and $\mathcal{N}$ is
the finite-dimensional space of harmonic (nullspace) eigenfunctions.

Complete sets of vector eigenfunctions for the interior of a rectangular box,
and for each of the four sets of boundary conditions we have considered, can
be found in Morse and Feshbach §13.1.

Problem 6.19: Hodge-Weyl and Helmholtz-Hodge. In this exercise we consider
the problem of what classes of vector-valued functions can be expanded in
terms of the various families of eigenfunctions of the previous problem. It is
tempting (but wrong) to think that we are restricted to expanding functions
that obey the same boundary conditions as the eigenfunctions themselves.
Thus, we might erroneously expect that the $\mathbf{E}_n$ are good only for expanding
functions whose divergence vanishes and have vanishing tangential boundary
components, or that the $\boldsymbol{\eta}_n$ can expand out only curl-free vector fields with
vanishing normal boundary component. That this supposition can be false
was exposed in section 2.2.3, where we showed that functions that are zero at
the endpoints of an interval can be used to expand out functions that are not
zero there. The key point is that each of our four families of $\mathbf{u}_n$ constitutes
a complete orthonormal set in $L^2_{\rm vec}(\Omega)$, and can therefore be used to expand
any vector field. As a consequence, the infinite sum $\sum a_n \mathbf{B}_n \in \mathcal{T}$ can, for
example, represent any vector-valued function $\mathbf{u} \in L^2_{\rm vec}(\Omega)$ provided only that
$\mathbf{u}$ possesses no component lying either in the subspace $\mathcal{L}$ of the longitudinal
eigenfunctions $\boldsymbol{\beta}_n$, or in the nullspace $\mathcal{N}$.

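a) Let $\mathcal{T} = \langle \mathbf{E}_n \rangle$ be the space of functions representable as infinite sums of
the $\mathbf{E}_n$. Show that

$$\langle \mathbf{E}_n \rangle^\perp = \{\mathbf{u} : {\rm curl}\,\mathbf{u} = 0 \text{ within } \Omega,\ \mathbf{n}\times\mathbf{u} = 0 \text{ on } \partial\Omega\}.$$

Find the corresponding perpendicular spaces for each of the other eight
orthogonal decomposition spaces.

b) Exploit your knowledge of $\langle \mathbf{E}_n \rangle^\perp$ acquired in part (a) to show that
$\langle \mathbf{E}_n \rangle$ itself is the Hilbert space

$$\langle \mathbf{E}_n \rangle = \{\mathbf{u} : {\rm div}\,\mathbf{u} = 0 \text{ within } \Omega,\ \text{no condition on } \partial\Omega\}.$$

Similarly show that

$$\langle \boldsymbol{\epsilon}_n \rangle = \{\mathbf{u} : {\rm curl}\,\mathbf{u} = 0 \text{ within } \Omega,\ \mathbf{n}\times\mathbf{u} = 0 \text{ on } \partial\Omega\},$$
$$\langle \boldsymbol{\eta}_n \rangle = \{\mathbf{u} : {\rm curl}\,\mathbf{u} = 0 \text{ within } \Omega,\ \text{no condition on } \partial\Omega\},$$
$$\langle \mathbf{H}_n \rangle = \{\mathbf{u} : {\rm div}\,\mathbf{u} = 0 \text{ within } \Omega,\ \mathbf{n}\cdot\mathbf{u} = 0 \text{ on } \partial\Omega\},$$
$$\langle \boldsymbol{\beta}_n \rangle = \{\mathbf{u} : {\rm curl}\,\mathbf{u} = 0 \text{ within } \Omega,\ \mathbf{n}\times\mathbf{u} = 0 \text{ on } \partial\Omega\},$$
$$\langle \mathbf{B}_n \rangle = \{\mathbf{u} : {\rm div}\,\mathbf{u} = 0 \text{ within } \Omega,\ \mathbf{n}\cdot\mathbf{u} = 0 \text{ on } \partial\Omega\}.$$

c) Conclude from the previous part that any vector field $\mathbf{u} \in L^2_{\rm vec}(\Omega)$
can be uniquely decomposed as the $L^2_{\rm vec}(\Omega)$ orthogonal sum

$$\mathbf{u} = \nabla\phi + {\rm curl}\,\mathbf{w} + \mathbf{h},$$

where $\nabla\phi \in \mathcal{L}$, ${\rm curl}\,\mathbf{w} \in \mathcal{T}$, and $\mathbf{h} \in \mathcal{N}$, under each of the following sets
of conditions:

i) The scalar $\phi$ is unrestricted, but $\mathbf{w}$ obeys $\mathbf{n}\times\mathbf{w} = 0$ on $\partial\Omega$, and
the harmonic $\mathbf{h}$ obeys $\mathbf{n}\cdot\mathbf{h} = 0$ on $\partial\Omega$. (The condition on $\mathbf{w}$ makes
${\rm curl}\,\mathbf{w}$ have vanishing normal boundary component.)

ii) The scalar $\phi$ is zero on $\partial\Omega$, while $\mathbf{w}$ is unrestricted on $\partial\Omega$. The
harmonic $\mathbf{h}$ obeys $\mathbf{n}\times\mathbf{h} = 0$ on $\partial\Omega$. (The condition on $\phi$ makes
$\nabla\phi$ have zero tangential boundary component.)

iii) The scalar $\phi$ is zero on $\partial\Omega$, the vector $\mathbf{w}$ obeys $\mathbf{n}\times\mathbf{w} = 0$ on
$\partial\Omega$, while the harmonic $\mathbf{h}$ requires no boundary condition. (The
conditions on $\phi$ and $\mathbf{w}$ make $\nabla\phi$ have zero tangential boundary
component and ${\rm curl}\,\mathbf{w}$ have vanishing normal boundary component.)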
d) As an illustration of the practical distinctions between the decompositions
in part (c), take $\Omega$ to be the unit cube in $\mathbb{R}^3$, and $\mathbf{u} = (1, 0, 0)$ a
constant vector field. Show that with conditions (i) we have $\mathbf{u} \in \mathcal{L}$, but
for (ii) we have $\mathbf{u} \in \mathcal{T}$, and for (iii) we have $\mathbf{u} \in \mathcal{N}$.

We see that the Hodge-Weyl decompositions of the eigenspaces correspond
one-to-one with the Helmholtz-Hodge decompositions of problem 6.16.



Chapter 7 



The Mathematics of Real 
Waves 

Waves are found everywhere in the physical world, but we often need more 
than the simple wave equation to understand them. The principal compli- 
cations are non-linearity and dispersion. In this chapter we will describe the 
mathematics lying behind some commonly observed, but still fascinating, 
phenomena. 

7.1 Dispersive waves 

In this section we will investigate the effects of dispersion, the dependence 
of the speed of propagation on the frequency of the wave. We will see that 
dispersion has a profound effect on the behaviour of a wave-packet. 

7.1.1 Ocean Waves 

The most commonly seen dispersive waves are those on the surface of water. 
Although often used to illustrate wave motion in class demonstrations, these 
waves are not as simple as they seem. 

In chapter one we derived the equations governing the motion of water
with a free surface. Now we will solve these equations. Recall that we
described the flow by introducing a velocity potential $\phi$ such that $\mathbf{v} = \nabla\phi$,
and a variable $h(x, t)$ which is the depth of the water at abscissa $x$.

Figure 7.1: Water with a free surface.

Again looking back to chapter one, we see that the fluid motion is determined
by imposing

$$\nabla^2 \phi = 0 \tag{7.1}$$

everywhere in the bulk of the fluid, together with boundary conditions

$$\frac{\partial\phi}{\partial y} = 0, \quad \text{on } y = 0, \tag{7.2}$$

$$\frac{\partial\phi}{\partial t} + \frac{1}{2}(\nabla\phi)^2 + gy = 0, \quad \text{on the free surface } y = h, \tag{7.3}$$

$$\frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} = 0, \quad \text{on the free surface } y = h. \tag{7.4}$$

Recall the physical interpretation of these equations: The vanishing of the
Laplacian of the velocity potential simply means that the bulk flow is
incompressible

$${\rm div}\,\mathbf{v} = \nabla^2\phi = 0. \tag{7.5}$$

The first two of the boundary conditions are also easy to interpret: The first
says that no water escapes through the lower boundary at $y = 0$. The second,
a form of Bernoulli's equation, asserts that the free surface is everywhere at
constant (atmospheric) pressure. The remaining boundary condition is more
obscure. It states that a fluid particle initially on the surface stays on the
surface. Remember that we set $f(x, y, t) = h(x, t) - y$, so the water surface
is given by $f(x, y, t) = 0$. If the surface particles are carried with the flow
then the convective derivative of $f$,

$$\frac{df}{dt} \stackrel{\rm def}{=} \frac{\partial f}{\partial t} + (\mathbf{v}\cdot\nabla) f, \tag{7.6}$$

should vanish on the free surface. Using $\mathbf{v} = \nabla\phi$ and the definition of $f$, this
reduces to

$$\frac{\partial h}{\partial t} + \frac{\partial h}{\partial x}\frac{\partial\phi}{\partial x} - \frac{\partial\phi}{\partial y} = 0, \tag{7.7}$$

which is indeed the last boundary condition.

Using our knowledge of solutions of Laplace's equation, we can immediately
write down a wave-like solution satisfying the boundary condition at
$y = 0$

$$\phi(x, y, t) = a \cosh(ky) \cos(kx - \omega t). \tag{7.8}$$

The tricky part is satisfying the remaining two boundary conditions. The
difficulty is that they are non-linear, and so couple modes with different
wave-numbers. We will circumvent the difficulty by restricting ourselves to
small amplitude waves, for which the boundary conditions can be linearized.
Suppressing all terms that contain a product of two or more small quantities,
we are left with

$$\frac{\partial\phi}{\partial t} + gh = 0, \tag{7.9}$$

$$\frac{\partial h}{\partial t} - \frac{\partial\phi}{\partial y} = 0. \tag{7.10}$$

Because $\phi$ is already a small quantity, and the wave amplitude is a small
quantity, linearization requires that these equations should be imposed at
the equilibrium surface of the fluid $y = h_0$. It is convenient to eliminate $h$ to
get

$$\frac{\partial^2\phi}{\partial t^2} + g\frac{\partial\phi}{\partial y} = 0, \quad \text{on } y = h_0. \tag{7.11}$$

Inserting (7.8) into this boundary condition leads to the dispersion equation

$$\omega^2 = gk \tanh kh_0, \tag{7.12}$$

relating the frequency to the wave-number.
Two limiting cases are of particular interest:

i) Long waves on shallow water: Here $kh_0 \ll 1$, and, in this limit,

$$\omega = k\sqrt{gh_0}.$$

ii) Waves on deep water: Here, $kh_0 \gg 1$, leading to $\omega = \sqrt{gk}$.
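The two limits can be checked numerically against the full dispersion equation (7.12). The following is a minimal sketch, not part of the original notes: the function names and the SI value of $g$ are assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI value)


def omega_full(k, h0, g=G):
    """Frequency from the dispersion equation omega^2 = g k tanh(k h0)."""
    return math.sqrt(g * k * math.tanh(k * h0))


def omega_shallow(k, h0, g=G):
    """Long-wave limit k h0 << 1: omega = k sqrt(g h0)."""
    return k * math.sqrt(g * h0)


def omega_deep(k, g=G):
    """Deep-water limit k h0 >> 1: omega = sqrt(g k)."""
    return math.sqrt(g * k)


# Compare the three expressions for small, moderate and large k*h0.
for k in (0.01, 1.0, 100.0):
    h0 = 1.0
    print(f"k*h0 = {k * h0:7.2f}  full = {omega_full(k, h0):.6f}  "
          f"shallow = {omega_shallow(k, h0):.6f}  deep = {omega_deep(k):.6f}")
```

For $kh_0 = 0.01$ the full result should agree with the shallow-water formula, and for $kh_0 = 100$ with the deep-water formula; at $kh_0 = 1$ neither limit is accurate.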
For deep water, the velocity potential becomes

$$\phi(x, y, t) = a e^{k(y - h_0)} \cos(kx - \omega t). \tag{7.13}$$

We see that the disturbance due to the surface wave dies away exponentially,
and becomes very small only a few wavelengths below the surface.

Remember that the velocity of the fluid is $\mathbf{v} = \nabla\phi$. To follow the motion
of individual particles of fluid we must solve the equations

$$\frac{dx}{dt} = -ak\, e^{k(y - h_0)} \sin(kx - \omega t),$$
$$\frac{dy}{dt} = ak\, e^{k(y - h_0)} \cos(kx - \omega t). \tag{7.14}$$

This is a system of coupled non-linear differential equations, but to find
the small amplitude motion of particles at the surface we may, to a first
approximation, set $x = x_0$, $y = h_0$ on the right-hand side. The orbits of the
surface particles are therefore approximately

$$x(t) = x_0 - \frac{ak}{\omega} \cos(kx_0 - \omega t),$$
$$y(t) = y_0 - \frac{ak}{\omega} \sin(kx_0 - \omega t), \tag{7.15}$$

where $y_0 = h_0$ for a particle at the surface.

Figure 7.2: Circular orbits in deep water surface waves.

For right-moving waves, the particle orbits are clockwise circles. At the
wave-crest the particles move in the direction of the wave propagation; in
the troughs they move in the opposite direction. Figure 7.2 shows that this
motion results in an up-down-asymmetric cycloidal wave profile.

When the effect of the bottom becomes significant, the circular orbits
deform into ellipses. For shallow water waves, the motion is principally
back-and-forth with motion in the $y$ direction almost negligible.
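The circular-orbit approximation (7.15) can be tested by integrating the full non-linear system (7.14) numerically. The sketch below is illustrative only: it uses a hand-written fourth-order Runge-Kutta step, the deep-water relation $\omega = \sqrt{gk}$, and parameter values ($a$, $k$, $g$) chosen arbitrarily so that the amplitude is small.

```python
import math


def surface_orbit(a, k, g=9.81, h0=0.0, x0=0.0, steps=4000):
    """Integrate (7.14) over one wave period, starting on the orbit (7.15).

    Returns the predicted orbit radius a*k/omega and the list of (x, y) points.
    """
    w = math.sqrt(g * k)          # deep-water dispersion relation
    radius = a * k / w            # orbit radius predicted by (7.15)

    def velocity(t, x, y):
        amp = a * k * math.exp(k * (y - h0))
        phase = k * x - w * t
        return -amp * math.sin(phase), amp * math.cos(phase)

    # Initial point taken from (7.15) at t = 0.
    x = x0 - radius * math.cos(k * x0)
    y = h0 - radius * math.sin(k * x0)
    t = 0.0
    dt = (2.0 * math.pi / w) / steps
    path = [(x, y)]
    for _ in range(steps):
        k1 = velocity(t, x, y)
        k2 = velocity(t + dt / 2, x + dt / 2 * k1[0], y + dt / 2 * k1[1])
        k3 = velocity(t + dt / 2, x + dt / 2 * k2[0], y + dt / 2 * k2[1])
        k4 = velocity(t + dt, x + dt * k3[0], y + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        path.append((x, y))
    return radius, path


radius, path = surface_orbit(a=0.003, k=1.0)
distances = [math.hypot(px, py) for px, py in path]
print("predicted radius:", radius)
print("min/max distance from orbit centre:", min(distances), max(distances))
```

For small amplitude the distance from the centre $(x_0, h_0)$ should stay close to $ak/\omega$ throughout the period. The small residual is the second-order effect neglected in passing from (7.14) to (7.15).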
7.1.2 Group Velocity

The most important effect of dispersion is that the group velocity of the waves
(the speed at which a wave-packet travels) differs from the phase velocity
(the speed at which individual wave-crests move). The group velocity is
also the speed at which the energy associated with the waves travels.

Suppose that we have waves with dispersion equation $\omega = \omega(k)$. A
right-going wave-packet of finite extent, and with initial profile $\varphi(x)$, can be
Fourier analyzed to give

$$\varphi(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} A(k) e^{ikx}. \tag{7.16}$$

Figure 7.3: A right-going wavepacket.

At later times this will evolve to

$$\varphi(x, t) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} A(k) e^{ikx - i\omega(k)t}. \tag{7.17}$$

Let us suppose for the moment that $A(k)$ is non-zero only for a narrow band
of wavenumbers around $k_0$, and that, restricted to this narrow band, we can
approximate the full $\omega(k)$ dispersion equation by

$$\omega(k) \approx \omega_0 + U(k - k_0). \tag{7.18}$$

Thus

$$\varphi(x, t) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} A(k) e^{ik(x - Ut)}\, e^{-i(\omega_0 - Uk_0)t}. \tag{7.19}$$

Comparing this with the Fourier expression for the initial profile, we find
that

$$\varphi(x, t) = e^{-i(\omega_0 - Uk_0)t} \varphi(x - Ut). \tag{7.20}$$

The pulse envelope therefore travels at speed $U$. This velocity

$$U \equiv \frac{\partial\omega}{\partial k} \tag{7.21}$$

is the group velocity. The individual wave crests, on the other hand, move
at the phase velocity $\omega(k)/k$.
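The statement that the envelope moves at $U = \partial\omega/\partial k$ rather than at $\omega/k$ can be illustrated by summing a narrow band of modes directly. In the sketch below the Gaussian form of $A(k)$, the deep-water dispersion $\omega = \sqrt{gk}$, and all numerical values are assumptions made for illustration; the integral (7.17) is approximated by a plain sum over equally spaced wavenumbers.

```python
import cmath
import math


def packet_peak(t, k0=10.0, sigma=0.5, g=9.81, n_modes=201, x_max=20.0, dx=0.05):
    """Return the position x in [0, x_max] at which |phi(x, t)| is largest."""
    # Wavenumbers spanning k0 +/- 5 sigma, with Gaussian amplitudes A(k).
    ks = [k0 + sigma * (-5.0 + 10.0 * j / (n_modes - 1)) for j in range(n_modes)]
    amps = [math.exp(-((k - k0) ** 2) / (2.0 * sigma ** 2)) for k in ks]
    omegas = [math.sqrt(g * k) for k in ks]   # deep-water dispersion

    best_x, best_abs = 0.0, -1.0
    for i in range(int(round(x_max / dx)) + 1):
        x = i * dx
        phi = sum(A * cmath.exp(1j * (k * x - w * t))
                  for A, k, w in zip(amps, ks, omegas))
        if abs(phi) > best_abs:
            best_x, best_abs = x, abs(phi)
    return best_x


k0, g, t = 10.0, 9.81, 20.0
u_group = 0.5 * math.sqrt(g / k0)   # d(omega)/dk at k0
u_phase = math.sqrt(g / k0)         # omega/k at k0
print("envelope peak at x =", packet_peak(t))
print("U_group * t =", u_group * t, "  U_phase * t =", u_phase * t)
```

With these values the envelope maximum should sit near $U_{\rm group}\,t$, roughly half the distance a crest travelling at the phase velocity would have covered.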

When the initial pulse contains a broad range of frequencies we can still
explore its evolution. We make use of a powerful tool for estimating the
behavior of integrals that contain a large parameter. In this case the parameter
is the time $t$. We begin by writing the Fourier representation of the wave as

$$\varphi(x, t) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} A(k) e^{it\psi(k)} \tag{7.22}$$

where

$$\psi(k) = k\left(\frac{x}{t}\right) - \omega(k). \tag{7.23}$$

Now look at the behaviour of this integral as $t$ becomes large, but while we
keep the ratio $x/t$ fixed. Since $t$ is very large, any variation of $\psi$ with $k$
will make the integrand a very rapidly oscillating function of $k$. Cancellation
between adjacent intervals with opposite phase will cause the net contribution
from such a region of the $k$ integration to be very small. The principal
contribution will come from the neighbourhood of stationary phase points,
i.e. points where

$$0 = \frac{d\psi}{dk} = \frac{x}{t} - \frac{\partial\omega}{\partial k}. \tag{7.24}$$

This means that, at points in space where $x/t = U$, we will only get
contributions from the Fourier components with wave-number satisfying

$$U = \frac{\partial\omega}{\partial k}. \tag{7.25}$$

The initial packet will therefore spread out, with those components of the
wave having wave-number $k$ travelling at speed

$$U_{\rm group} = \frac{\partial\omega}{\partial k}. \tag{7.26}$$

This is the same expression for the group velocity that we obtained in the
narrow-band case. Again this speed of propagation should be contrasted
with that of the wave-crests, which travel at

$$U_{\rm phase} = \frac{\omega}{k}. \tag{7.27}$$

The "stationary phase" argument may seem a little hand-waving, but it can
be developed into a systematic approximation scheme. We will do this in
chapter ??.

Example: Water Waves. The dispersion equation for waves on deep water is
$\omega = \sqrt{gk}$. The phase velocity is therefore

$$U_{\rm phase} = \sqrt{\frac{g}{k}}, \tag{7.28}$$

whilst the group velocity is

$$U_{\rm group} = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2} U_{\rm phase}. \tag{7.29}$$

This difference is easily demonstrated by tossing a stone into a pool and
observing how individual wave-crests overtake the circular wave packet and
die out at the leading edge, while new crests and troughs come into being at
the rear and make their way to the front.
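The factor of one-half is easy to confirm numerically for any dispersion relation supplied as a function. This is a minimal sketch with hypothetical helper names; the derivative in (7.26) is approximated by a central difference.

```python
import math


def phase_velocity(omega, k):
    """U_phase = omega(k) / k."""
    return omega(k) / k


def group_velocity(omega, k, rel_step=1e-6):
    """U_group = d(omega)/dk, estimated by a central finite difference."""
    dk = rel_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


def deep_water(k, g=9.81):
    """Deep-water dispersion relation omega = sqrt(g k)."""
    return math.sqrt(g * k)


for k in (0.1, 1.0, 10.0):
    ratio = group_velocity(deep_water, k) / phase_velocity(deep_water, k)
    print(f"k = {k:5.1f}   U_group / U_phase = {ratio:.8f}")
```

Passing a different function in place of `deep_water` gives the corresponding ratio for other dispersion relations.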

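This result can be extended to three dimensions with

$$(U_{\rm group})_i = \frac{\partial\omega}{\partial k_i}. \tag{7.30}$$

Example: de Broglie Waves. The plane-wave solutions of the time-dependent
Schrödinger equation

$$i\frac{\partial\psi}{\partial t} = -\frac{1}{2m}\nabla^2\psi \tag{7.31}$$

are

$$\psi = e^{i\mathbf{k}\cdot\mathbf{r} - i\omega t}, \tag{7.32}$$

with

$$\omega(\mathbf{k}) = \frac{1}{2m}\mathbf{k}^2. \tag{7.33}$$

The group velocity is therefore

$$\mathbf{U}_{\rm group} = \frac{1}{m}\mathbf{k}, \tag{7.34}$$

which is the classical velocity of the particle.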
7.1.3 Wakes

There are many circumstances when waves are excited by an object moving at
a constant velocity through a background medium, or by a stationary object
immersed in a steady flow. The resulting wakes carry off energy, and therefore
create wave drag. Wakes are involved, for example, in sonic booms, Cerenkov
radiation, the Landau criterion for superfluidity, and Landau damping of
plasma oscillations. Here, we will consider some simple water-wave analogues
of these effects. The common principle for all wakes is that the resulting wave
pattern is time independent when observed from the object exciting it.

Example: Obstacle in a Stream. Consider a log lying submerged in a rapidly
flowing stream.




Figure 7.4: Log in a stream. 

The obstacle disturbs the water and generates a train of waves. If the log lies
athwart the stream, the problem is essentially one-dimensional and easy to
analyse. The essential point is that the distance of the wavecrests from the log
does not change with time, and therefore the wavelength of the disturbance
the log creates is selected by the condition that the phase velocity of the wave
coincide with the velocity of the mean flow.$^1$ The group velocity does come
into play, however. If the group velocity of the waves is less than the phase
velocity, the energy being deposited in the wave-train by the disturbance will
be swept downstream, and the wake will lie behind the obstacle. If the group
velocity is higher than the phase velocity, and this is the case with very short
wavelength ripples on water where surface tension is more important than
gravity, the energy will propagate against the flow, and so the ripples appear
upstream of the obstacle.

1 In his book Waves in Fluids, M. J. Lighthill quotes Robert Frost on this phenomenon:

The black stream, catching on a sunken rock, 
Flung backward on itself in one white wave, 
And the white water rode the black forever, 
Not gaining but not losing. 
Figure 7.5: Kelvin's ship-wave construction.



Example: Kelvin Ship Waves. A more subtle problem is the pattern of waves 
left behind by a ship on deep water. The shape of the pattern is determined 
by the group velocity for deep-water waves being one-half that of the phase 
velocity. 

How the wave pattern is formed can be understood from figure 7.5. In 
order that the pattern of wavecrests be time independent, the waves emitted 
in the direction AC must have phase velocity such that their crests travel 
from A to C while the ship goes from A to B. The crest of the wave emitted 
from the bow of the ship in the direction AC will therefore lie along the line 
BC — or at least there would be a wave crest on this line if the emitted 
wave energy travelled at the phase velocity. The angle at C must be a right 
angle because the direction of propagation is perpendicular to the wave- 
crests. Euclid, by virtue of his angle-in-a-semicircle theorem, now tells us 
that the locus of all possible points C (for all directions of wave emission) 
is the larger circle. Because, however, the wave energy only travels at one- 
half the phase velocity, the waves going in the direction AC actually have 
significant amplitude only on the smaller circle, which has half the radius of 
the larger. The wake therefore lies on, and within, the Kelvin wedge, whose 
boundary lies at an angle $\theta$ to the ship's path. This angle is determined by
the ratio OD/OB = 1/3 to be

$$\theta = \sin^{-1}(1/3) = 19.5^\circ. \tag{7.35}$$

Remarkably, this angle, and hence the width of the wake, is independent of
the speed of the ship.

Figure 7.6: Large-scale Kelvin wakes. (Image source: US Navy)

The waves actually on the edge of the wedge are usually the most prominent,
and they will have crests perpendicular to the line AD. This orientation
is indicated on the left hand figure, and reproduced as the predicted pattern
of wavecrests on the right. The prediction should be compared with the wave
systems in figures 7.6 and 7.7.
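The geometric construction can be reproduced numerically. The sketch below places A at the origin and the ship at B = (1, 0), sweeps the emission direction, puts C at the foot of the right angle on the larger circle, and takes D as the midpoint of AC (the energy having travelled half as far as the crest). The coordinates and function name are choices made here for illustration only.

```python
import math


def kelvin_half_angle(n=20000):
    """Largest angle (in degrees), seen from the ship at B, subtended by the
    points D reached by wave energy emitted from A in any forward direction."""
    best = 0.0
    for j in range(1, n):
        alpha = (math.pi / 2.0) * j / n      # angle between AC and the track AB
        ac = math.cos(alpha)                 # |AC| when |AB| = 1 and angle ACB = 90 deg
        # Energy travels at half the phase velocity: D is the midpoint of AC.
        dx = 0.5 * ac * math.cos(alpha)
        dy = 0.5 * ac * math.sin(alpha)
        # Angle between the track BA and the line BD.
        best = max(best, math.atan2(dy, 1.0 - dx))
    return math.degrees(best)


print("wedge half-angle from the sweep:", kelvin_half_angle())
print("arcsin(1/3) in degrees:         ", math.degrees(math.asin(1.0 / 3.0)))
```

Nothing in the construction refers to the ship's speed: rescaling AB rescales every length equally, which is why the angle comes out the same for any speed.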

7.1.4 Hamilton's Theory of Rays 

We have seen that wave packets travel at a frequency-dependent group ve- 
locity. We can extend this result to study the motion of waves in weakly 
inhomogeneous media, and so derive an analogy between the "geometric op- 
tics" limit of wave motion and classical dynamics. 

Consider a packet composed of a roughly uniform train of waves spread
out over a region that is substantially longer and wider than their mean
wavelength. The essential feature of such a wave train is that at any particular
point of space and time, $\mathbf{x}$ and $t$, it has a definite phase $\theta(\mathbf{x}, t)$. Once we
know this phase, we can define the local frequency $\omega$ and wave-vector $\mathbf{k}$ by

$$\omega = -\left(\frac{\partial\theta}{\partial t}\right)_{\mathbf{x}}, \qquad k_i = \left(\frac{\partial\theta}{\partial x_i}\right)_t. \tag{7.36}$$

These definitions are motivated by the idea that

$$\theta(\mathbf{x}, t) \sim \mathbf{k}\cdot\mathbf{x} - \omega t, \tag{7.37}$$

at least locally.

We wish to understand how $\mathbf{k}$ changes as the wave propagates through a
slowly varying medium. We introduce the inhomogeneity by assuming that
the dispersion equation $\omega = \omega(\mathbf{k})$, which is initially derived for a uniform
medium, can be extended to $\omega = \omega(\mathbf{k}, \mathbf{x})$, where the $\mathbf{x}$ dependence arises,
for example, as a result of a position-dependent refractive index. This
assumption is only an approximation, but it is a good approximation when the
distance over which the medium changes is much larger than the distance
between wavecrests.

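Applying the equality of mixed partials to the definitions of $\mathbf{k}$ and $\omega$ gives
us

$$\left(\frac{\partial k_i}{\partial t}\right)_{\mathbf{x}} = -\left(\frac{\partial\omega}{\partial x_i}\right)_t, \qquad \left(\frac{\partial k_i}{\partial x_j}\right)_t = \left(\frac{\partial k_j}{\partial x_i}\right)_t. \tag{7.38}$$

The subscripts indicate what is being left fixed when we differentiate. We
must be careful about this, because we want to use the dispersion equation
to express $\omega$ as a function of $\mathbf{k}$ and $\mathbf{x}$, and the wave-vector $\mathbf{k}$ will itself be a
function of $\mathbf{x}$ and $t$.

Taking this dependence into account, we write

$$\left(\frac{\partial\omega}{\partial x_i}\right)_t = \left(\frac{\partial\omega}{\partial x_i}\right)_{\mathbf{k}} + \left(\frac{\partial\omega}{\partial k_j}\right)_{\mathbf{x}} \left(\frac{\partial k_j}{\partial x_i}\right)_t. \tag{7.39}$$

We now use (7.38) to rewrite this as

$$\left(\frac{\partial k_i}{\partial t}\right)_{\mathbf{x}} + \left(\frac{\partial\omega}{\partial k_j}\right)_{\mathbf{x}} \left(\frac{\partial k_i}{\partial x_j}\right)_t = -\left(\frac{\partial\omega}{\partial x_i}\right)_{\mathbf{k}}. \tag{7.40}$$

Interpreting the left hand side as a convective derivative

$$\frac{dk_i}{dt} = \left(\frac{\partial k_i}{\partial t}\right)_{\mathbf{x}} + (\mathbf{v}_g\cdot\nabla) k_i,$$

we read off that

$$\frac{dk_i}{dt} = -\left(\frac{\partial\omega}{\partial x_i}\right)_{\mathbf{k}} \tag{7.41}$$

provided we are moving at velocity

$$\frac{dx_i}{dt} = (\mathbf{v}_g)_i = \left(\frac{\partial\omega}{\partial k_i}\right)_{\mathbf{x}}. \tag{7.42}$$

Since this is the group velocity, the packet of waves is actually travelling at
this speed. The last two equations therefore tell us how the orientation and
wavelength of the wave train evolve if we ride along with the packet as it is
refracted by the inhomogeneity.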



The formulae

$$\dot{\mathbf{k}} = -\frac{\partial\omega}{\partial\mathbf{x}}, \qquad \dot{\mathbf{x}} = \frac{\partial\omega}{\partial\mathbf{k}}, \tag{7.43}$$

are Hamilton's ray equations. These Hamilton equations are identical in form
to Hamilton's equations for classical mechanics

$$\dot{\mathbf{p}} = -\frac{\partial H}{\partial\mathbf{x}}, \qquad \dot{\mathbf{x}} = \frac{\partial H}{\partial\mathbf{p}}, \tag{7.44}$$

except that $\mathbf{k}$ is playing the role of the canonical momentum, $\mathbf{p}$, and $\omega(\mathbf{k}, \mathbf{x})$
replaces the Hamiltonian, $H(\mathbf{p}, \mathbf{x})$. This formal equivalence of geometric
optics and classical mechanics was a mystery in Hamilton's time. Today we
understand that classical mechanics is nothing but the geometric optics limit
of wave mechanics.
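Hamilton's ray equations are straightforward to integrate numerically. The sketch below traces a ray in two dimensions for a hypothetical non-dispersive medium with $\omega(\mathbf{k}, \mathbf{x}) = c(y)|\mathbf{k}|$, where the linear speed profile $c(y) = 1 + 0.1y$ is an assumption chosen purely for illustration. Because $\omega$ has no explicit dependence on $t$ or $x$, both $\omega$ and $k_x$ should be conserved along the ray (the latter is Snell's law).

```python
import math


def speed(y):
    """Hypothetical slowly varying wave speed c(y)."""
    return 1.0 + 0.1 * y


def dspeed_dy(y):
    return 0.1


def omega(kx, ky, y):
    """Local dispersion relation omega(k, x) = c(y) |k|."""
    return speed(y) * math.hypot(kx, ky)


def rhs(state):
    """Right-hand sides of Hamilton's ray equations for (x, y, kx, ky)."""
    x, y, kx, ky = state
    kmag = math.hypot(kx, ky)
    return (speed(y) * kx / kmag,      # dx/dt  =  d(omega)/d(kx)
            speed(y) * ky / kmag,      # dy/dt  =  d(omega)/d(ky)
            0.0,                       # dkx/dt = -d(omega)/dx
            -dspeed_dy(y) * kmag)      # dky/dt = -d(omega)/dy


def rk4_step(state, dt):
    def shifted(deriv, factor):
        return tuple(s + factor * d for s, d in zip(state, deriv))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def trace_ray(state, dt=1e-3, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


start = (0.0, 0.0, 1.0, 1.0)
end = trace_ray(start)
print("final (x, y, kx, ky):", end)
print("omega at start and end:", omega(1.0, 1.0, 0.0), omega(end[2], end[3], end[1]))
```

As the ray climbs into the faster medium, $k_y$ decreases while $k_x$ stays fixed, so the ray bends back towards the region of lower wave speed, as expected for refraction.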



7.2 Making Waves 

Many waves occurring in nature are generated by the energy of some steady 
flow being stolen away to drive an oscillatory motion. Familiar examples 
include the music of a flute and the waves raised on the surface of water by 
the wind. The latter process is quite subtle and was not understood until the 
work of J. W. Miles in 1957. Miles showed that in order to excite waves the 
wind speed has to vary with the height above the water, and that waves of 
a given wavelength take energy only from the wind at that height where the 
windspeed matches the phase velocity of the wave. The resulting resonant 
energy transfer turns out to have analogues in many branches of science. In 
this section we will exhibit this phenomenon in the simpler situation where 
the varying flow is that of the water itself. 



7.2.1 Rayleigh's Equation 

Consider water flowing in a shallow channel where friction forces keep the
water in contact with the stream-bed from moving. We will show that the resulting
shear flow is unstable to the formation of waves on the water surface. The 
consequences of this instability are most often seen in a thin sheet of water 
running down the face of a dam. The sheet starts off flowing smoothly, but, 
as the water descends, waves form and break, and the water reaches the 
bottom in irregular pulses called roll waves. 

It is easiest to describe what is happening from the vantage of a reference 
frame that rides along with the surface water. In this frame the velocity 
profile of the flow will be as shown in figure 7.8. 
Figure 7.8: The velocity profile $U(y)$ in a frame at which the surface is at
rest.

Since the flow is incompressible but not irrotational, we will describe the
motion by using a stream function $\Psi$, in terms of which the fluid velocity is
given by

$$v_x = -\partial_y \Psi,$$
$$v_y = \partial_x \Psi. \tag{7.45}$$

This parameterization automatically satisfies $\nabla\cdot\mathbf{v} = 0$, while the ($z$ component
of) the vorticity becomes

$$\Omega \equiv \partial_x v_y - \partial_y v_x = \nabla^2 \Psi. \tag{7.46}$$

We will consider a stream function of the form$^2$

$$\Psi(x, y, t) = \psi_0(y) + \psi(y) e^{ikx - i\omega t}, \tag{7.47}$$

where $\psi_0$ obeys $-\partial_y \psi_0 = v_x = U(y)$, and describes the horizontal mean flow.
The term containing $\psi(y)$ represents a small-amplitude wave disturbance
superposed on the mean flow. We will investigate whether this disturbance
grows or decreases with time.

Euler's equation can be written as

$$\dot{\mathbf{v}} + \boldsymbol{\Omega}\times\mathbf{v} = -\nabla\left(P + \frac{v^2}{2} + gy\right), \tag{7.48}$$

where $\boldsymbol{\Omega} = \Omega\,\hat{\mathbf{z}}$. Taking the curl of this, and taking into account the two dimensional
character of the problem, we find that

$$\partial_t \Omega + (\mathbf{v}\cdot\nabla)\Omega = 0. \tag{7.49}$$

2 The physical stream function is, of course, the real part of this expression.
This, a general property of two-dimensional incompressible motion, says that
vorticity is convected with the flow. We now express (7.49) in terms of $\Psi$,
when it becomes

$$\nabla^2 \dot{\Psi} + (\mathbf{v}\cdot\nabla)\nabla^2 \Psi = 0. \tag{7.50}$$

Substituting the expression (7.47) into (7.50), and keeping only terms of first
order in $\psi$, gives

$$-i\omega\left(\frac{d^2}{dy^2} - k^2\right)\psi + iUk\left(\frac{d^2}{dy^2} - k^2\right)\psi + ik\psi\,\partial_y(-\partial_y U) = 0,$$

or

$$\left(\frac{d^2}{dy^2} - k^2\right)\psi - \left(\frac{\partial^2 U}{\partial y^2}\right)\frac{1}{(U - \omega/k)}\,\psi = 0. \tag{7.51}$$

This is Rayleigh's equation.$^3$ If only the first term were present, it would
have solutions $\psi \propto e^{\pm ky}$, and we would have recovered the results of section
7.1.1. The second term is significant, however. It will diverge if there is a
point $y_c$ such that $U(y_c) = \omega/k$. In other words, if there is a depth at which
the flow speed coincides with the phase velocity of the wave disturbance, thus
allowing a resonant interaction between the wave and flow. An actual infinity
in (7.51) will be evaded, though, because $\omega$ will gain a small imaginary part
$\omega \to \omega_R + i\gamma$. A positive imaginary part means that the wave amplitude is
growing exponentially with time. A negative imaginary part means that the
wave is being damped. With $\gamma$ included, we then have

$$\frac{1}{(U - \omega/k)} \to \frac{U - \omega_R/k}{(U - \omega_R/k)^2 + \gamma^2/k^2} + \frac{i\gamma/k}{(U - \omega_R/k)^2 + \gamma^2/k^2}$$
$$\approx \frac{U - \omega_R/k}{(U - \omega_R/k)^2 + \gamma^2/k^2} + i\pi\,{\rm sgn}\left(\frac{\gamma}{k}\right)\delta\big(U(y) - \omega_R/k\big)$$
$$= \frac{U - \omega_R/k}{(U - \omega_R/k)^2 + \gamma^2/k^2} + i\pi\,{\rm sgn}\left(\frac{\gamma}{k}\right)\left|\frac{\partial U}{\partial y}\right|^{-1}_{y_c}\delta(y - y_c). \tag{7.52}$$



To specify the problem fully we need to impose boundary conditions on
$\psi(y)$. On the lower surface we can set $\psi(0) = 0$, as this will keep the fluid
at rest there. On the upper surface $y = h$ we apply Euler's equation

$$\dot{\mathbf{v}} + \boldsymbol{\Omega}\times\mathbf{v} = -\nabla\left(P + \frac{v^2}{2} + gh\right). \tag{7.53}$$

3 Lord Rayleigh, "On the stability or instability of certain fluid motions." Proc. Lond.
Math. Soc. 11 (1880).
We observe that $P$ is constant, being atmospheric pressure, and the $v^2/2$ can
be neglected as it is of second order in the disturbance. Then, considering
the $x$ component, we have

$$-\nabla_x\, gh = -g\,\partial_x \int^t v_y\, dt = -g\left(\frac{k^2}{i\omega}\right)\psi\, e^{ikx - i\omega t} \tag{7.54}$$

on the free surface. To lowest order we can apply the boundary condition on
the equilibrium free surface $y = y_0$. The boundary condition is therefore

$$\frac{1}{\psi}\frac{d\psi}{dy} + \frac{k}{\omega}\frac{\partial U}{\partial y} = g\frac{k^2}{\omega^2}, \quad y = y_0. \tag{7.55}$$

We usually have $\partial U/\partial y = 0$ near the surface, so this simplifies to

$$\frac{1}{\psi}\frac{d\psi}{dy} = g\frac{k^2}{\omega^2}, \quad y = y_0. \tag{7.56}$$

That this is sensible can be confirmed by considering the case of waves on
still, deep water, where $\psi(y) = e^{|k|y}$. The boundary condition then reduces
to $|k| = gk^2/\omega^2$, or $\omega^2 = g|k|$, which is the correct dispersion equation for
such waves.

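We find the corresponding dispersion equation for waves on shallow flowing
water by computing

$$\left.\frac{1}{\psi}\frac{d\psi}{dy}\right|_{y_0} \tag{7.57}$$

from Rayleigh's equation (7.51). Multiplying by $\psi^*$ and integrating gives

$$0 = \int_0^{y_0} dy \left\{\psi^*\left(\frac{d^2}{dy^2} - k^2\right)\psi + k\left(\frac{\partial^2 U}{\partial y^2}\right)\frac{1}{(\omega - Uk)}|\psi|^2\right\}. \tag{7.58}$$

An integration by parts then gives

$$\left[\psi^*\frac{d\psi}{dy}\right]_0^{y_0} = \int_0^{y_0} dy \left\{\left|\frac{d\psi}{dy}\right|^2 + k^2|\psi|^2 + \left(\frac{\partial^2 U}{\partial y^2}\right)\frac{1}{(U - \omega/k)}|\psi|^2\right\}. \tag{7.59}$$

The lower limit makes no contribution, since $\psi^*$ is zero there. On using (7.52)
and taking the imaginary part, we find

$${\rm Im}\left(\psi^*\frac{d\psi}{dy}\right)_{y_0} = {\rm sgn}\left(\frac{\gamma}{k}\right)\pi\left(\frac{\partial^2 U}{\partial y^2}\right)_{y_c}\left|\frac{\partial U}{\partial y}\right|^{-1}_{y_c}|\psi(y_c)|^2, \tag{7.60}$$

or

$${\rm Im}\left(\frac{1}{\psi}\frac{d\psi}{dy}\right)_{y_0} = {\rm sgn}\left(\frac{\gamma}{k}\right)\pi\left(\frac{\partial^2 U}{\partial y^2}\right)_{y_c}\left|\frac{\partial U}{\partial y}\right|^{-1}_{y_c}\frac{|\psi(y_c)|^2}{|\psi(y_0)|^2}. \tag{7.61}$$

This equation is most useful if the interaction with the flow does not
substantially perturb $\psi(y)$ away from the still-water result $\psi(y) = \sinh(|k|y)$,
and assuming this is so provides a reasonable first approximation.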
If we insert (7.61) into (7.56), where we approximate, 
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We see that either sign of 7 is allowed by our analysis. Thus the resonant 
interaction between the shear flow and wave appears to lead to either ex- 
ponential growth or damping of the wave. This is inevitable because our 
inviscid fluid contains no mechanism for dissipation, and its motion is neces- 
sarily time-reversal invariant. Nonetheless, as in our discussion of "friction 
without friction" in section 5.2.2, only one sign of 7 is actually observed. 
This sign is determined by the initial conditions, but a rigorous explanation 
of how this works mathematically is not easy, and is the subject of many 
papers. These show that the correct sign is given by 



7 = 



CO 



-TV 
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2gk 2 \ dy 
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dU 



dy 



yc 



(7.63) 



Since our velocity profile has $d^2U/dy^2 < 0$, this means that the waves grow
in amplitude.

We can also establish the correct sign for $\gamma$ by computing the change of
momentum in the background flow due to the wave.$^4$ The crucial element is
whether, in the neighbourhood of the critical depth, more fluid is overtaking
the wave than lagging behind it. This is exactly what the quantity
$d^2U/dy^2$ measures.



4 G. E. Vekstein, "Landau resonance mechanism for plasma and wind-generated water
waves," American Journal of Physics, 66 (1998) 886-92.

7.3 Non-linear Waves

Non-linear effects become important when some dimensionless measure of
the amplitude of the disturbance, say $\Delta P/P$ for a sound wave, or $\Delta h/\lambda$ for
a water wave, is no longer $\ll 1$.



7.3.1 Sound in Air 

The simplest non-linear wave system is one-dimensional sound propagation 
in a gas. This problem was studied by Riemann. 

The one-dimensional motion of a fluid is determined by the mass conservation
equation

$$\partial_t \rho + \partial_x(\rho v) = 0, \tag{7.64}$$

and Euler's equation of motion

$$\rho(\partial_t v + v\,\partial_x v) = -\partial_x P. \tag{7.65}$$

In a fluid with equation of state $P = P(\rho)$, the speed of sound, $c$, is given by

$$c^2 = \frac{dP}{d\rho}. \tag{7.66}$$

It will in general depend on $P$, the speed of propagation being usually higher
when the pressure is higher.

Riemann was able to simplify these equations by defining a new thermodynamic
variable $\pi(P)$ as

$$\pi = \int_{P_0}^{P} \frac{1}{\rho c}\,dP, \tag{7.67}$$

where $P_0$ is the equilibrium pressure of the undisturbed air. The quantity $\pi$
obeys

$$\frac{d\pi}{dP} = \frac{1}{\rho c}. \tag{7.68}$$

In terms of $\pi$, Euler's equation divided by $\rho$ becomes

$$\partial_t v + v\,\partial_x v + c\,\partial_x \pi = 0, \tag{7.69}$$

whilst the equation of mass conservation divided by $\rho/c$ becomes

$$\partial_t \pi + v\,\partial_x \pi + c\,\partial_x v = 0. \tag{7.70}$$

Adding and subtracting, we get Riemann's equations

$$\partial_t(v + \pi) + (v + c)\,\partial_x(v + \pi) = 0,$$

$$\partial_t(v - \pi) + (v - c)\,\partial_x(v - \pi) = 0. \tag{7.71}$$

These assert that the Riemann invariants $v \pm \pi$ are constant along the
characteristic curves

$$\frac{dx}{dt} = v \pm c. \tag{7.72}$$

This tells us that signals travel at the speed $v \pm c$. In other words, they travel,
with respect to the fluid, at the local speed of sound $c$. Using the Riemann
equations, we can propagate initial data $v(x, t = 0)$, $\pi(x, t = 0)$ into the
future by using the method of characteristics.
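As a concrete illustration of the variable $\pi$ (an added example, not part of the original derivation), assume an ideal gas undergoing isentropic changes, so that $P = K\rho^{\gamma}$ with constant $K$ and $\gamma = c_p/c_v$. Under that assumption the integral defining $\pi$ can be done in closed form:

```latex
c^2 = \frac{dP}{d\rho} = \gamma K \rho^{\gamma-1}
\quad\Longrightarrow\quad
\frac{dc}{c} = \frac{\gamma-1}{2}\,\frac{d\rho}{\rho},
\\[4pt]
d\pi = \frac{dP}{\rho c} = \frac{c^2\,d\rho}{\rho c} = c\,\frac{d\rho}{\rho}
     = \frac{2}{\gamma-1}\,dc
\quad\Longrightarrow\quad
\pi = \frac{2}{\gamma-1}\,(c - c_0),
```

where $c_0$ is the sound speed at the equilibrium pressure $P_0$. For such a gas the Riemann invariants are $v \pm 2(c - c_0)/(\gamma - 1)$, so knowing $v$ and $\pi$ at a point gives the local sound speed there directly.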

In figure 7.9 the value of $v + \pi$ is constant along the characteristic curve $C_+$,
which is the solution of

$$\frac{dx}{dt} = v + c \tag{7.73}$$

passing through A. The value of $v - \pi$ is constant along $C_-$, which is the
solution of

$$\frac{dx}{dt} = v - c \tag{7.74}$$

passing through B. Thus the values of $\pi$ and $v$ at the point P can be found if
we know the initial values of $v + \pi$ at the point A and $v - \pi$ at the point B.
Having found $v$ and $\pi$ at P we can invert $\pi(P)$ to find the pressure $P$, and
hence $c$, and so continue the characteristics into the future, as indicated by
the dotted lines. We need, of course, to know $v$ and $c$ at every point along
the characteristics $C_+$ and $C_-$ in order to construct them, and this requires
us to treat every point as a "P". The values of the dynamical quantities
at P therefore depend on the initial data at all points lying between A and
B. This is the domain of dependence of P.

Figure 7.10: Simple wave characteristics.

A sound wave caused by a localized excess of pressure will eventually
break up into two distinct pulses, one going forwards and one going
backwards. Once these pulses are sufficiently separated that they no longer
interact with one another they are simple waves. Consider a forward-going pulse
propagating into undisturbed air. The backward characteristics are coming
from the undisturbed region where both $\pi$ and $v$ are zero. Clearly $\pi - v$ is
zero everywhere on these characteristics, and so $\pi = v$. Now $\pi + v = 2v = 2\pi$
is constant along the forward characteristics, and so $\pi$ and $v$ are individually
constant along them. Since $\pi$ is constant, so is $c$. With $v$ also being constant,
this means that $c + v$ is constant. In other words, for a simple wave, the
characteristics are straight lines.

This simple-wave simplification contains within it the seeds of its own 
destruction. Suppose we have a positive pressure pulse in a fluid whose 
speed of sound increases with the pressure. Figure 7.10 shows how, with 
this assumption, the straight-line characteristics travel faster in the high 
pressure region, and eventually catch up with and intersect the slower-moving 
characteristics. When this happens the dynamical variables will become 
multivalued. How do we deal with this?

7.3.2 Shocks

Let us untangle the multivaluedness by drawing another set of pictures.
Suppose $u$ obeys the non-linear "half" wave equation

$$(\partial_t + u\,\partial_x)u = 0. \tag{7.75}$$

The velocity of propagation of the wave is therefore $u$ itself, so the parts of
the wave with large $u$ will overtake those with smaller $u$, and the wave will
"break," as shown in figure 7.11.





Figure 7.11: A breaking non-linear wave. 
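The breaking of the wave can be made quantitative. Each value of $u$ is carried along a straight characteristic $x = x_0 + u_0(x_0)t$, so neighbouring characteristics first cross at $t_b = -1/\min_x u_0'(x)$. The following is a minimal sketch of that estimate; the Gaussian initial profile `u0` and the sampling grid are illustrative assumptions, not taken from the text.

```python
import math

def u0(x):
    """Hypothetical initial profile: a smooth Gaussian hump."""
    return math.exp(-x * x)

def du0(x):
    """Derivative of the initial profile."""
    return -2.0 * x * math.exp(-x * x)

def breaking_time(xs):
    """Earliest crossing time of neighbouring characteristics: -1 / min u0'."""
    steepest = min(du0(x) for x in xs)
    return -1.0 / steepest if steepest < 0 else math.inf

def is_single_valued(t, xs):
    """True if the points x0 + u0(x0) * t are still in their original order."""
    positions = [x + u0(x) * t for x in xs]
    return all(b > a for a, b in zip(positions, positions[1:]))

# Sample the initial data on a fine grid of starting points x0.
xs = [-5.0 + 0.001 * i for i in range(10001)]
t_b = breaking_time(xs)

print("estimated breaking time:", t_b)
print("single valued just before:", is_single_valued(0.9 * t_b, xs))
print("single valued just after: ", is_single_valued(1.1 * t_b, xs))
```

For this particular profile the steepest negative slope is at $x_0 = 1/\sqrt{2}$, which gives $t_b = \sqrt{e/2}$; the ordering test simply checks whether the mapped profile has folded over.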



Physics does not permit such multivalued solutions, and what usually hap- 
pens is that the assumptions underlying the model which gave rise to the 
nonlinear equation will no longer be valid. New terms should be included in 
the equation which prevent the solution becoming multivalued, and instead 
a steep "shock" will form. 




Figure 7.12: Formation of a shock.

Examples of equations with such additional terms are Burgers' equation

$$(\partial_t + u\,\partial_x)u = \nu\,\partial^2_{xx}u, \tag{7.76}$$

and the Korteweg-de Vries (KdV) equation (4.11), which, by a suitable rescaling
of $x$ and $t$, we can write as

$$(\partial_t + u\,\partial_x)u = \delta\,\partial^3_{xxx}u. \tag{7.77}$$

Burgers' equation, for example, can be thought of as including the effects of
thermal conductivity, which was not included in the derivation of Riemann's
equations. In both these modified equations, the right hand side is negligible
when $u$ is varying slowly, but it completely changes the character of the
solution when the waves steepen and try to break.

Although these extra terms are essential for the stabilization of the shock, 
once we know that such a discontinuous solution has formed, we can find 
many of its properties — for example the propagation velocity — from general 
principles, without needing their detailed form. All we need is to know what 
conservation laws are applicable. 

Multiplying $(\partial_t + u\,\partial_x)u = 0$ by $u^{n-1}$, we deduce that

$$\partial_t\left\{\frac{1}{n}u^n\right\} + \partial_x\left\{\frac{1}{n+1}u^{n+1}\right\} = 0, \tag{7.78}$$

and this implies that

$$Q_n = \int_{-\infty}^{\infty} u^n\,dx \tag{7.79}$$

is time independent. There are infinitely many of these conservation laws,
one for each $n$. Suppose that the $n$-th conservation law continues to hold even
in the presence of the shock, and that the discontinuity is at $X(t)$. Then



$$\frac{d}{dt}\left\{\int_{-\infty}^{X(t)} u^n\,dx + \int_{X(t)}^{\infty} u^n\,dx\right\} = 0. \tag{7.80}$$

This is equal to

$$u_-^n(X)\dot X - u_+^n(X)\dot X + \int_{-\infty}^{X(t)} \partial_t u^n\,dx + \int_{X(t)}^{\infty} \partial_t u^n\,dx = 0, \tag{7.81}$$

where $u_-^n(X) \equiv u^n(X - \epsilon)$ and $u_+^n(X) \equiv u^n(X + \epsilon)$. Now, using $(\partial_t + u\,\partial_x)u = 0$
in the regions away from the shock, where it is reliable, we can write this as

$$(u_+^n - u_-^n)\dot X = -\frac{n}{n+1}\int_{-\infty}^{X(t)} \partial_x u^{n+1}\,dx - \frac{n}{n+1}\int_{X(t)}^{\infty} \partial_x u^{n+1}\,dx = \frac{n}{n+1}\left(u_+^{n+1} - u_-^{n+1}\right). \tag{7.82}$$

The velocity at which the shock moves is therefore

$$\dot X = \left(\frac{n}{n+1}\right)\frac{(u_+^{n+1} - u_-^{n+1})}{(u_+^n - u_-^n)}. \tag{7.83}$$

Since the shock can only move at one velocity, only one of the infinitely many
conservation laws can continue to hold in the modified theory!
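That the different conservation laws really do demand different shock velocities is easy to see numerically. The sketch below evaluates the speed $\dot X = \frac{n}{n+1}(u_+^{n+1} - u_-^{n+1})/(u_+^n - u_-^n)$ for a few values of $n$; the function name `shock_speed` and the values either side of the jump are illustrative choices.

```python
def shock_speed(n, u_minus, u_plus):
    """Shock speed implied by assuming the n-th conservation law survives."""
    numerator = u_plus ** (n + 1) - u_minus ** (n + 1)
    denominator = u_plus ** n - u_minus ** n
    return (n / (n + 1)) * numerator / denominator

# Illustrative values of u just to the left and right of the discontinuity.
u_minus, u_plus = 2.0, 1.0

for n in (1, 2, 3):
    print(n, shock_speed(n, u_minus, u_plus))
```

With these values the three laws give $3/2$, $14/9$ and $45/28$ respectively, so at most one of them can describe the actual shock. For $n = 1$ the speed is just the mean of the values on the two sides.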
Example: Burgers' equation. From

$$(\partial_t + u\,\partial_x)u = \nu\,\partial^2_{xx}u, \tag{7.84}$$

we deduce that

$$\partial_t u + \partial_x\left\{\frac{1}{2}u^2 - \nu\,\partial_x u\right\} = 0, \tag{7.85}$$

so that $Q_1 = \int u\,dx$ is conserved, but further investigation shows that no
other conservation law survives. The shock speed is therefore

$$\dot X = \frac{1}{2}(u_+ + u_-). \tag{7.86}$$

Example: KdV equation. From

$$(\partial_t + u\,\partial_x)u = \delta\,\partial^3_{xxx}u, \tag{7.87}$$

we deduce that

$$\partial_t u + \partial_x\left\{\frac{1}{2}u^2 - \delta\,\partial^2_{xx}u\right\} = 0,$$

$$\partial_t\left\{\frac{1}{2}u^2\right\} + \partial_x\left\{\frac{1}{3}u^3 - \delta\,u\,\partial^2_{xx}u + \frac{1}{2}\delta\,(\partial_x u)^2\right\} = 0,$$

$$\vdots$$

where the dots refer to an infinite sequence of (not exactly obvious) conservation
laws. Since more than one conservation law survives, the KdV equation
cannot have shock-like solutions. Instead, the steepening wave breaks up
into a sequence of solitons.
Example: Hydraulic Jump, or Bore 



Figure 7.13: A Hydraulic Jump.

A stationary hydraulic jump is a place in a stream where the fluid abruptly
increases in depth from $h_1$ to $h_2$, and simultaneously slows down from
supercritical (faster than wave-speed) flow to subcritical (slower than wave-speed)
flow. Such jumps are commonly seen near weirs, and white-water rapids.$^5$ A
circular hydraulic jump is easily created in your kitchen sink. The moving
equivalent is the tidal bore.

The equations governing uniform (meaning that $v$ is independent of the
depth) flow in channels are mass conservation

$$\partial_t h + \partial_x\{hv\} = 0, \tag{7.88}$$

and Euler's equation

$$\partial_t v + v\,\partial_x v = -\partial_x\{gh\}. \tag{7.89}$$

We could manipulate these into the Riemann form, and work from there, but
it is more direct to combine them to derive the momentum conservation law

$$\partial_t\{hv\} + \partial_x\left\{hv^2 + \frac{1}{2}gh^2\right\} = 0. \tag{7.90}$$

From Euler's equation, assuming steady flow, $\dot v = 0$, we can also deduce
Bernoulli's equation

$$\frac{1}{2}v^2 + gh = {\rm const.}, \tag{7.91}$$

5 The breaking crest of Frost's "white wave" is probably as much an example of a
hydraulic jump as of a smooth downstream wake.

which is an energy conservation law. At the jump, mass and momentum
must be conserved:

$$h_1 v_1 = h_2 v_2,$$

$$h_1 v_1^2 + \frac{1}{2}gh_1^2 = h_2 v_2^2 + \frac{1}{2}gh_2^2, \tag{7.92}$$

and $v_2$ may be eliminated to find

$$v_1^2 = \frac{1}{2}g\left(\frac{h_2}{h_1}\right)(h_1 + h_2). \tag{7.93}$$

A change of frame reveals that $v_1$ is the speed at which a wall of water of
height $h = (h_2 - h_1)$ would propagate into stationary water of depth $h_1$.

Bernoulli's equation is inconsistent with the two equations we have used,
and so

$$\frac{1}{2}v_1^2 + gh_1 \ne \frac{1}{2}v_2^2 + gh_2. \tag{7.94}$$

This means that energy is being dissipated: for strong jumps, the fluid
downstream is turbulent. For weaker jumps, the energy is radiated away in a train
of waves, the so-called "undular bore".
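The jump conditions can be checked with a few lines of arithmetic. The sketch below takes the depths on either side, computes $v_1$ from $v_1^2 = \frac{1}{2}g(h_2/h_1)(h_1 + h_2)$ and $v_2$ from mass conservation, and then compares the momentum flux and the Bernoulli quantity $\frac{1}{2}v^2 + gh$ across the jump. The depths, the value of `g` and the function names are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)

def jump_velocities(h1, h2, g=G):
    """Upstream and downstream speeds for a stationary jump from h1 to h2."""
    v1 = math.sqrt(0.5 * g * (h2 / h1) * (h1 + h2))
    v2 = h1 * v1 / h2  # mass conservation: h1 v1 = h2 v2
    return v1, v2

def momentum_flux(h, v, g=G):
    """The conserved combination h v^2 + g h^2 / 2."""
    return h * v * v + 0.5 * g * h * h

def bernoulli(h, v, g=G):
    """The Bernoulli quantity v^2 / 2 + g h."""
    return 0.5 * v * v + g * h

h1, h2 = 0.1, 0.3  # depths in metres (illustrative)
v1, v2 = jump_velocities(h1, h2)

momentum_mismatch = momentum_flux(h1, v1) - momentum_flux(h2, v2)
energy_loss = bernoulli(h1, v1) - bernoulli(h2, v2)

print("v1, v2:", v1, v2)
print("momentum flux mismatch:", momentum_mismatch)
print("drop in Bernoulli quantity:", energy_loss)
```

The momentum mismatch vanishes to rounding error, while the Bernoulli quantity drops by $g(h_2 - h_1)^3/(4h_1h_2)$, which is positive whenever the depth increases across the jump. The same numbers confirm that the upstream flow is faster than the local wave speed $\sqrt{gh_1}$ and the downstream flow slower than $\sqrt{gh_2}$.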

Example: Shock Wave in Air: At a shock wave in air we have conservation
of mass

$$\rho_1 v_1 = \rho_2 v_2, \tag{7.95}$$

and momentum

$$\rho_1 v_1^2 + P_1 = \rho_2 v_2^2 + P_2. \tag{7.96}$$

In this case, however, Bernoulli's equation does hold,$^6$ so we also have

$$\frac{1}{2}v_1^2 + h_1 = \frac{1}{2}v_2^2 + h_2. \tag{7.97}$$

Here, $h$ is the specific enthalpy ($U + PV$ per unit mass). Entropy, though, is
not conserved, so we cannot use $PV^\gamma = {\rm const.}$ across the shock. From mass

6 Recall that enthalpy is conserved in a throttling process even in the presence of
dissipation. Bernoulli's equation for a gas is the generalization of this thermodynamic result to
include the kinetic energy of the gas. The difference between the shock wave in air, where
Bernoulli holds, and the hydraulic jump, where it does not, is that the enthalpy of the gas
keeps track of the lost mechanical energy, which has been absorbed by the internal degrees
of freedom. The Bernoulli equation for channel flow keeps track only of the mechanical
energy of the mean flow.

and momentum conservation alone we find

$$v_1^2 = \left(\frac{\rho_2}{\rho_1}\right)\frac{P_2 - P_1}{\rho_2 - \rho_1}. \tag{7.98}$$

For an ideal gas with $c_p/c_v = \gamma$, we can use energy conservation to eliminate
the densities, and find

$$v_1 = c_0\sqrt{1 + \frac{\gamma + 1}{2\gamma}\,\frac{P_2 - P_1}{P_1}}. \tag{7.99}$$

Here, $c_0$ is the speed of sound in the undisturbed gas.
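The elimination of the densities can be checked numerically. The sketch below is an added illustration: it assumes an ideal gas with specific enthalpy $h = \frac{\gamma}{\gamma - 1}P/\rho$ and uses the standard Rankine-Hugoniot density ratio that follows from combining mass, momentum and energy conservation (a result quoted here, not derived in the text). It then compares the shock speed obtained from mass and momentum alone, $v_1^2 = (\rho_2/\rho_1)(P_2 - P_1)/(\rho_2 - \rho_1)$, with the pressure-only form $v_1 = c_0\sqrt{1 + \frac{\gamma + 1}{2\gamma}(P_2 - P_1)/P_1}$. All numerical values and function names are illustrative.

```python
import math

GAMMA = 1.4  # c_p / c_v, the usual value assumed for air

def density_ratio(p1, p2, gamma=GAMMA):
    """rho2 / rho1 across the shock (standard ideal-gas Rankine-Hugoniot result)."""
    return ((gamma + 1) * p2 + (gamma - 1) * p1) / ((gamma - 1) * p2 + (gamma + 1) * p1)

def v1_from_mass_and_momentum(p1, rho1, p2, rho2):
    """Upstream speed from mass and momentum conservation alone."""
    return math.sqrt((rho2 / rho1) * (p2 - p1) / (rho2 - rho1))

def v1_from_pressure_jump(p1, rho1, p2, gamma=GAMMA):
    """Upstream speed written in terms of the pressure jump only."""
    c0 = math.sqrt(gamma * p1 / rho1)  # sound speed in the undisturbed gas
    return c0 * math.sqrt(1.0 + (gamma + 1) / (2.0 * gamma) * (p2 - p1) / p1)

def enthalpy(p, rho, gamma=GAMMA):
    """Specific enthalpy of an ideal gas."""
    return gamma / (gamma - 1.0) * p / rho

p1, rho1, p2 = 1.0, 1.0, 2.0           # arbitrary consistent units
rho2 = rho1 * density_ratio(p1, p2)
v1_a = v1_from_mass_and_momentum(p1, rho1, p2, rho2)
v1_b = v1_from_pressure_jump(p1, rho1, p2)
v2 = rho1 * v1_a / rho2                # mass conservation

bernoulli_mismatch = (0.5 * v1_a**2 + enthalpy(p1, rho1)) - (0.5 * v2**2 + enthalpy(p2, rho2))

print("v1 from mass and momentum:", v1_a)
print("v1 from pressure jump:    ", v1_b)
print("Bernoulli mismatch:       ", bernoulli_mismatch)
```

The two expressions for $v_1$ agree, and the Bernoulli combination $\frac{1}{2}v^2 + h$ matches on both sides of the shock, in contrast to the hydraulic jump.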



7.3.3 Weak Solutions 

We want to make mathematically precise the sense in which a function $u$
with a discontinuity can be a solution to the differential equation

$$\partial_t\left\{\frac{1}{n}u^n\right\} + \partial_x\left\{\frac{1}{n+1}u^{n+1}\right\} = 0, \tag{7.100}$$

even though the equation is surely meaningless if the functions to which the
derivatives are being applied are not in fact differentiable.

We could play around with distributions like the Heaviside step function
or the Dirac delta, but this is unsafe for non-linear equations, because the
product of two distributions is generally not meaningful. What we do is
introduce a new concept. We say that $u$ is a weak solution to (7.100) if

$$\int_{\mathbb{R}^2} dx\,dt\left\{u^n\,\partial_t\varphi + \frac{n}{n+1}u^{n+1}\,\partial_x\varphi\right\} = 0, \tag{7.101}$$

for all test functions $\varphi$ in some suitable space $T$. This equation has formally
been obtained from (7.100) by multiplying it by $\varphi(x,t)$, integrating over
all space-time, and then integrating by parts to move the derivatives off $u$,
and onto the smooth function $\varphi$. If $u$ is assumed smooth then all these
manipulations are legitimate and the new equation (7.101) contains no new
information. A conventional solution to (7.100) is therefore also a weak
solution. The new formulation (7.101), however, admits solutions in which $u$
has shocks.

Figure 7.14: The geometry of the domains to the right and left of a jump.

Let us see what is required of a weak solution if we assume that $u$ is
everywhere smooth except for a single jump from $u_-(t)$ to $u_+(t)$ at the point
$X(t)$. Let $D_-$ and $D_+$ be the regions to the left and right of the jump, as shown in
figure 7.14. Then the weak-solution condition (7.101) becomes

$$0 = \int_{D_-} dx\,dt\left\{u^n\,\partial_t\varphi + \frac{n}{n+1}u^{n+1}\,\partial_x\varphi\right\} + \int_{D_+} dx\,dt\left\{u^n\,\partial_t\varphi + \frac{n}{n+1}u^{n+1}\,\partial_x\varphi\right\}. \tag{7.102}$$

Let

$$\mathbf{n} = \left(\frac{1}{\sqrt{1 + |\dot X|^2}},\ \frac{-\dot X}{\sqrt{1 + |\dot X|^2}}\right) \tag{7.103}$$

be the unit outward normal to $D_-$, then, using the divergence theorem, we
have

$$\int_{D_-} dx\,dt\left\{u^n\,\partial_t\varphi + \frac{n}{n+1}u^{n+1}\,\partial_x\varphi\right\} = \int_{D_-} dx\,dt\left\{-\varphi\left(\partial_t u^n + \frac{n}{n+1}\partial_x u^{n+1}\right)\right\} + \int_{\partial D_-} dt\,\varphi\left\{-\dot X(t)\,u_-^n + \frac{n}{n+1}u_-^{n+1}\right\}. \tag{7.104}$$

Here we have written the integration measure over the boundary as

$$ds = \sqrt{1 + |\dot X|^2}\,dt. \tag{7.105}$$

Performing the same manoeuvre for $D_+$, and observing that $\varphi$ can be any
smooth function, we deduce that

i) $\partial_t u^n + \frac{n}{n+1}\partial_x u^{n+1} = 0$ within $D_\pm$.

ii) $\dot X\,(u_+^n - u_-^n) = \frac{n}{n+1}(u_+^{n+1} - u_-^{n+1})$ on $X(t)$.

The reasoning here is identical to that in chapter one, where we considered
variations at endpoints to obtain natural boundary conditions. We therefore
end up with the same equations for the motion of the shock as before.
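A discrete analogue of this statement is that a numerical scheme written in conservation form moves a jump at the speed fixed by the conserved quantity. The sketch below is an illustration only: it applies a first-order upwind finite-volume update to the $n = 1$ law $\partial_t u + \partial_x(\frac{1}{2}u^2) = 0$ with step initial data, assuming $u > 0$ everywhere so that upwinding from the left is appropriate. The grid, time step and initial values are arbitrary choices, and the scheme is a generic textbook one rather than anything taken from these notes.

```python
def flux(u):
    """Flux of the n = 1 conservation law."""
    return 0.5 * u * u

def evolve(u, u_inflow, dt_over_dx, steps):
    """First-order upwind update in conservation form (valid for u > 0)."""
    for _ in range(steps):
        left = [u_inflow] + u[:-1]  # value in the cell to the left of each cell
        u = [ui - dt_over_dx * (flux(ui) - flux(ul)) for ui, ul in zip(u, left)]
    return u

n_cells, dx, dt = 400, 0.01, 0.004      # CFL number u_max * dt / dx = 0.8
u_left, u_right, x_jump = 2.0, 1.0, 1.0
steps = 250

# Step initial data: u_left to the left of x_jump, u_right to the right.
u = [u_left if (i + 0.5) * dx < x_jump else u_right for i in range(n_cells)]
u = evolve(u, u_left, dt / dx, steps)
t_final = steps * dt

# Shock position inferred from the conserved integral of u.
x_from_mass = sum(ui - u_right for ui in u) * dx / (u_left - u_right)
# Shock position read off the profile: first cell below the mean value.
x_from_profile = next((i + 0.5) * dx for i, ui in enumerate(u)
                      if ui < 0.5 * (u_left + u_right))
x_expected = x_jump + 0.5 * (u_left + u_right) * t_final

print(x_from_mass, x_from_profile, x_expected)
```

Because the update only moves flux between neighbouring cells, the sum over cells telescopes in the same way that the integration by parts does in the continuum argument, and the numerical jump travels at $\frac{1}{2}(u_+ + u_-)$ regardless of how it is smeared across a few cells.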

The notion of weak solutions is widely used in applied mathematics, and it 
is the principal ingredient of the finite element method of numerical analysis 
in continuum dynamics. 



7.4 Solitons 

A localized disturbance in a dispersive medium soon falls apart, since its 
various frequency components travel at differing speeds. At the same time, 
non-linear effects will distort the wave profile. In some systems, however, 
these effects of dispersion and non-linearity can compensate each other and 
give rise to solitons — stable solitary waves which propagate for long distances 
without changing their form. Not all equations possessing wave-like solutions 
also possess solitary wave solutions. The best-known examples of equations
that do are:

1) The Korteweg-de Vries (KdV) equation, which in the form

has a solitary wave solution

$$u = 2\alpha^2\,{\rm sech}^2(\alpha x - \alpha^3 t) \tag{7.107}$$

which travels at speed $\alpha^2$. The larger the amplitude, therefore, the
faster the solitary wave travels. This equation applies to steep waves
in shallow water.

2) The non-linear Schrödinger (NLS) equation with attractive interactions

where $\lambda > 0$. It has solitary-wave solution

$$\psi = e^{ikx - i\omega t}\sqrt{\frac{\alpha}{m\lambda}}\,{\rm sech}\,\sqrt{\alpha}(x - Ut), \tag{7.109}$$

where

$$k = mU, \quad \omega = \frac{1}{2}mU^2 - \frac{\alpha}{2m}. \tag{7.110}$$

In this case, the speed is independent of the amplitude, and the moving
solution can be obtained from a stationary one by means of a Galilean
boost. The nonlinear equation for the stationary wavepacket may be
solved by observing that

$$(-\partial_x^2 - 2\,{\rm sech}^2 x)\psi_0 = -\psi_0, \tag{7.111}$$

where $\psi_0(x) = {\rm sech}\,x$. This is the bound-state of the Pöschl-Teller
equation that we have met several times before. The non-linear Schrödinger
equation describes many systems, including the dynamics of tornadoes,
where the solitons manifest as the knot-like kinks sometimes seen
winding their way up thin funnel clouds.$^7$
3) The sine-Gordon (SG) equation is

$$\frac{\partial^2\varphi}{\partial t^2} - \frac{\partial^2\varphi}{\partial x^2} + \frac{m^2}{\beta}\sin\beta\varphi = 0. \tag{7.112}$$

This has solitary-wave solutions

$$\varphi(x,t) = \frac{4}{\beta}\tan^{-1}\left\{e^{\pm m\gamma(x - Ut)}\right\}, \tag{7.113}$$

where $\gamma = (1 - U^2)^{-1/2}$ and $|U| < 1$. The velocity is not related to the
amplitude, and the moving soliton can again be obtained by boosting
a stationary soliton. The boost is now a Lorentz transformation, and
so we only get subluminal solitons, whose width is Lorentz contracted
by the usual relativistic factor of $\gamma$. The sine-Gordon equation
describes, for example, the evolution of light pulses whose frequency is in
resonance with an atomic transition in the propagation medium.$^8$

In the case of the sine-Gordon soliton, the origin of the solitary wave is
particularly easy to understand, as it can be realized as a "twist" in a chain
of coupled pendulums. The handedness of the twist determines whether we
take the + or - sign in the solution (7.113).
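A solitary-wave formula of this kind is easy to test numerically. The sketch below takes the sine-Gordon equation in the form $\partial_t^2\varphi - \partial_x^2\varphi + (m^2/\beta)\sin\beta\varphi = 0$ and the boosted kink $\varphi = (4/\beta)\tan^{-1}\exp\{m\gamma(x - Ut)\}$, and evaluates the left-hand side with central finite differences. The parameter values, step size and sample points are illustrative assumptions; the residual should be small but not exactly zero, because of the finite-difference error.

```python
import math

M, BETA, U = 1.3, 0.7, 0.5  # illustrative parameters with |U| < 1

def kink(x, t, sign=+1):
    """Boosted sine-Gordon solitary wave; sign selects the handedness of the twist."""
    gamma = 1.0 / math.sqrt(1.0 - U * U)
    return (4.0 / BETA) * math.atan(math.exp(sign * M * gamma * (x - U * t)))

def residual(x, t, h=1e-3, sign=+1):
    """phi_tt - phi_xx + (m^2 / beta) sin(beta phi), by central differences."""
    f0 = kink(x, t, sign)
    f_tt = (kink(x, t + h, sign) - 2.0 * f0 + kink(x, t - h, sign)) / h**2
    f_xx = (kink(x + h, t, sign) - 2.0 * f0 + kink(x - h, t, sign)) / h**2
    return f_tt - f_xx + (M * M / BETA) * math.sin(BETA * f0)

points = [(-2.0, 0.0), (0.3, 0.5), (1.0, 2.0), (2.5, 1.0)]
worst = max(abs(residual(x, t, sign=s)) for x, t in points for s in (+1, -1))
print("largest residual:", worst)
```

Both signs in the exponent pass the test, and across the kink the quantity $\beta\varphi$ winds by $2\pi$, which is the full twist of the pendulum chain.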



7 H. Hasimoto, J. Fluid Mech. 51 (1972) 477-485.

8 See G. L. Lamb, Rev. Mod. Phys. 43 (1971) 99, for a nice review.

Figure 7.15: A sine-Gordon solitary wave as a twist in a ribbon of coupled
pendulums.

The existence of solitary-wave solutions is interesting in its own right.
It was the fortuitous observation of such a wave by John Scott Russell on
the Union Canal, near Hermiston in Scotland, that founded the subject.$^9$
Even more remarkable was Scott Russell's subsequent discovery (made in a
specially constructed trough in his garden) of what is now called the soliton
property: two colliding solitary waves interact in a complicated manner yet
emerge from the encounter with their form unchanged, having suffered no
more than a slight time delay. Each of the three equations given above has
exact multi-soliton solutions which show this phenomenon.

After languishing for more than a century, soliton theory has grown to
be a huge subject. It is, for example, studied by electrical engineers who
use soliton pulses in fibre-optic communications. No other type of signal
can propagate through thousands of kilometres of undersea cable without
degradation. Solitons, or "quantum lumps", are also important in particle
physics. The nucleon can be thought of as a knotted soliton (in this case
called a "skyrmion") in the pion field, and gauge-field monopole solitons

9 "I was observing the motion of a boat which was rapidly drawn along a narrow channel 
by a pair of horses, when the boat suddenly stopped - not so the mass of water in the 
channel which it had put in motion; it accumulated round the prow of the vessel in a state 
of violent agitation, then suddenly leaving it behind, rolled forward with great velocity, 
assuming the form of a large solitary elevation, a rounded, smooth and well-defined heap 
of water, which continued its course along the channel apparently without change of form 
or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate 
of some eight or nine miles an hour, preserving its original figure some thirty feet long and 
a foot to a foot and a half in height. Its height gradually diminished, and after a chase of 
one or two miles I lost it in the windings of the channel. Such, in the month of August 
1834, was my first chance interview with that singular and beautiful phenomenon which I 
have called the Wave of Translation." - John Scott Russell, 1844.

appear in many string and field theories. The soliton equations themselves
are aristocrats among partial differential equations, with ties into almost
every other branch of mathematics.

Practical Illustration: Solitons in Optical Fibres. We wish to transmit
picosecond pulses of light with a carrier frequency $\omega_0$. Suppose that the
dispersive properties of the fibre are such that the associated wavenumber for
frequencies near $\omega_0$ can be expanded as

$$k = \Delta k + k_0 + \beta_1(\omega - \omega_0) + \frac{1}{2}\beta_2(\omega - \omega_0)^2 + \cdots. \tag{7.114}$$

Here, $\beta_1$ is the reciprocal of the group velocity, and $\beta_2$ is a parameter called
the group velocity dispersion (GVD). The term $\Delta k$ parameterizes the change
in refractive index due to non-linear effects. It is proportional to the
mean-square of the electric field. Let us write the electric field as

$$E(z,t) = A(z,t)\,e^{ik_0 z - i\omega_0 t}, \tag{7.115}$$

where $A(z,t)$ is a slowly varying envelope function. When we transform from
Fourier variables to space and time we have

$$(\omega - \omega_0) \to i\frac{\partial}{\partial t}, \quad (k - k_0) \to -i\frac{\partial}{\partial z}, \tag{7.116}$$

and so the equation determining $A$ becomes

$$-i\frac{\partial A}{\partial z} = i\beta_1\frac{\partial A}{\partial t} - \frac{\beta_2}{2}\frac{\partial^2 A}{\partial t^2} + \Delta k\,A. \tag{7.117}$$

If we set $\Delta k = \gamma|A|^2$, where $\gamma$ is normally positive, we have

$$i\left(\frac{\partial A}{\partial z} + \beta_1\frac{\partial A}{\partial t}\right) = \frac{\beta_2}{2}\frac{\partial^2 A}{\partial t^2} - \gamma|A|^2 A. \tag{7.118}$$

We may get rid of the first-order time derivative by transforming to a frame
moving at the group velocity. We do this by setting

$$\tau = t - \beta_1 z, \quad \zeta = z \tag{7.119}$$

and using the chain rule, as we did for the Galilean transformation in
homework set 0. The equation for $A$ ends up being

$$i\frac{\partial A}{\partial \zeta} = \frac{\beta_2}{2}\frac{\partial^2 A}{\partial \tau^2} - \gamma|A|^2 A. \tag{7.120}$$

This looks like our non-linear Schrödinger equation, but with the role of
space and time interchanged! Also, the coefficient of the second derivative
has the wrong sign so, to make it coincide with the Schrödinger equation we
studied earlier, we must have $\beta_2 < 0$. When this condition holds, we are
said to be in the "anomalous dispersion" regime, although this is rather
a misnomer since it is the group refractive index, $N_g = c/v_{\rm group}$, that is
decreasing with frequency, not the ordinary refractive index. For pure SiO$_2$
glass, $\beta_2$ is negative for wavelengths greater than 1.27 μm. We therefore have
anomalous dispersion in the technologically important region near 1.55 μm,
where the glass is most transparent. In the anomalous dispersion regime we
have solitons with

$$A(\zeta,\tau) = e^{i\alpha|\beta_2|\zeta/2}\sqrt{\frac{\alpha|\beta_2|}{\gamma}}\,{\rm sech}\,\sqrt{\alpha}(\tau), \tag{7.121}$$

leading to

$$E(z,t) = \sqrt{\frac{\alpha|\beta_2|}{\gamma}}\,{\rm sech}\,\sqrt{\alpha}(t - \beta_1 z)\,e^{i\alpha|\beta_2|z/2}\,e^{ik_0 z - i\omega_0 t}. \tag{7.122}$$

This equation describes a pulse propagating at $\beta_1^{-1}$, which is the group
velocity.
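The envelope soliton can be checked against the envelope equation by direct substitution. The sketch below assumes the moving-frame equation in the form $i\,\partial_\zeta A = \frac{1}{2}\beta_2\,\partial_\tau^2 A - \gamma|A|^2 A$ with $\beta_2 < 0$, and the trial pulse $A = \sqrt{\alpha|\beta_2|/\gamma}\;{\rm sech}(\sqrt{\alpha}\,\tau)\,e^{i\alpha|\beta_2|\zeta/2}$; it evaluates the mismatch between the two sides using central finite differences. The parameter values are arbitrary illustrative numbers, not fibre data.

```python
import cmath
import math

ALPHA, BETA2, GAMMA_NL = 2.0, -0.5, 1.5  # illustrative values with BETA2 < 0

def envelope(zeta, tau):
    """Trial soliton envelope A(zeta, tau)."""
    amplitude = math.sqrt(ALPHA * abs(BETA2) / GAMMA_NL)
    phase = cmath.exp(0.5j * ALPHA * abs(BETA2) * zeta)
    return amplitude / math.cosh(math.sqrt(ALPHA) * tau) * phase

def residual(zeta, tau, h=1e-3):
    """i dA/dzeta - (beta2 / 2) d2A/dtau2 + gamma |A|^2 A, by central differences."""
    a = envelope(zeta, tau)
    dA_dzeta = (envelope(zeta + h, tau) - envelope(zeta - h, tau)) / (2.0 * h)
    d2A_dtau2 = (envelope(zeta, tau + h) - 2.0 * a + envelope(zeta, tau - h)) / h**2
    return 1j * dA_dzeta - 0.5 * BETA2 * d2A_dtau2 + GAMMA_NL * abs(a) ** 2 * a

points = [(0.0, 0.0), (1.0, 0.4), (3.0, -1.2), (0.5, 2.0)]
worst = max(abs(residual(zeta, tau)) for zeta, tau in points)
print("largest residual:", worst)
```

The residual is at the level of the finite-difference error, which supports both the amplitude $\sqrt{\alpha|\beta_2|/\gamma}$ and the phase rate $\alpha|\beta_2|/2$; changing either one in `envelope` makes the residual of order one.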

Exercise 7.1: Find the expression for the sine-Gordon soliton, by first showing
that the static sine-Gordon equation

$$\frac{\partial^2\varphi}{\partial x^2} - \frac{m^2}{\beta}\sin\beta\varphi = 0$$

implies that

$$\frac{1}{2}\varphi'^2 + \frac{m^2}{\beta^2}\cos\beta\varphi = {\rm const.},$$

and solving this equation (for a suitable choice of the constant) by separation
of variables. Next, show that if $f(x)$ is a solution of the static equation, then
$f(\gamma(x - Ut))$, $\gamma = (1 - U^2)^{-1/2}$, $|U| < 1$, is a solution of the time-dependent
equation.

Exercise 7.2: Lax pair for the non-linear Schrödinger equation. Let $L$ be the
matrix differential operator



L = 



id x X* 
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and let P be the matrix 

$$P = \begin{pmatrix} i|\chi|^2 & \chi'^* \\ -\chi' & -i|\chi|^2 \end{pmatrix}.$$

Show that the equation

$$\dot L = [L, P]$$

is equivalent to the non-linear Schrödinger equation

$$i\dot\chi = -\chi'' - 2|\chi|^2\chi.$$

7.5 Further Exercises and Problems 

Here are some further problems on non-linear and dispersive waves: 

Problem 7.3: The Equation of Telegraphy. Oliver Heaviside's equations
relating the voltage $v(x,t)$ and current $i(x,t)$ in a transmission line are

$$L\frac{\partial i}{\partial t} + Ri = -\frac{\partial v}{\partial x},$$

$$C\frac{\partial v}{\partial t} + Gv = -\frac{\partial i}{\partial x}.$$

Here $R$, $C$, $L$ and $G$ are respectively the resistance, capacitance, inductance,
and leakance of each unit length of the line.

a) Show that Heaviside's equations lead to $v(x,t)$ obeying

$$LC\frac{\partial^2 v}{\partial t^2} + (LG + RC)\frac{\partial v}{\partial t} + RGv = \frac{\partial^2 v}{\partial x^2},$$

and also to a similar equation for $i(x,t)$.

b) Seek a travelling-wave solution of the form

$$v(x,t) = v_0\,e^{i(kx - \omega t)},$$
$$i(x,t) = i_0\,e^{i(kx - \omega t)},$$

and find the dispersion equation relating $\omega$ and $k$. From this relation,
show that signals propagate undistorted (i.e. with frequency-independent
attenuation) at speed $1/\sqrt{LC}$ provided that the Heaviside condition
$RC = LG$ is satisfied.

c) Show that the characteristic impedance $Z \equiv v_0/i_0$ of the transmission
line is given by

$$Z(\omega) = \sqrt{\frac{R + i\omega L}{G + i\omega C}}.$$

Deduce that the characteristic impedance is frequency independent if the
Heaviside condition is satisfied.

In practical applications, the Heaviside condition can be satisfied by periodi- 
cally inserting extra inductors — known as loading coils — into the line. 

Problem 7.4: Pantograph Drag. A high-speed train picks up its electrical
power via a pantograph from an overhead line. The locomotive travels at
speed $U$ and the pantograph exerts a constant vertical force $F$ on the power
line.

Figure 7.16: A high-speed train.

We make the usual small amplitude approximations and assume (not
unrealistically) that the line is supported in such a way that its vertical displacement
obeys an inhomogeneous Klein-Gordon equation

$$\rho\ddot y - Ty'' + \rho\Omega^2 y = F\delta(x - Ut),$$

with $c = \sqrt{T/\rho}$, the velocity of propagation of short-wavelength transverse
waves on the overhead cable.

a) Assume that U < c and solve for the steady state displacement of the 
cable about the pickup point. (Hint: the disturbance is time-independent 
when viewed from the train.) 

b) Now assume that U > c. Again find an expression for the displacement 
of the cable. (The same hint applies, but the physically appropriate 
boundary conditions are very different!) 

c) By equating the rate at which wave-energy

$$E = \int\left\{\frac{1}{2}\rho\dot y^2 + \frac{1}{2}Ty'^2 + \frac{1}{2}\rho\Omega^2 y^2\right\}dx$$

is being created to the rate at which the locomotive is doing work,
calculate the wave-drag on the train. In particular, show that there is no
drag at all until $U$ exceeds $c$. (Hint: While the front end of the wake is
moving at speed $U$, the trailing end of the wake is moving forward at the
group velocity of the wave-train.)
d) By carefully considering the force the pantograph exerts on the overhead 
cable, again calculate the induced drag. You should get the same answer 
as in part c) (Hint: To the order needed for the calculation, the tension 
in the cable is the same before and after the train has passed, but the 
direction in which the tension acts is different. The force F is therefore 
not exactly vertical, but has a small forward component. Don't forget 
that the resultant of the forces is accelerating the cable.) 

This problem of wake formation and drag is related both to Cerenkov radiation 
and to the Landau criterion for superfluidity. 

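Exercise 7.5: Inertial waves. A rotating tank of incompressible ($\rho = 1$) fluid
can host waves whose restoring force is provided by angular momentum
conservation. Suppose the fluid velocity at the point $\mathbf{r}$ is given by

$$\mathbf{v}(\mathbf{r},t) = \mathbf{u}(\mathbf{r},t) + \boldsymbol{\Omega}\times\mathbf{r},$$

where $\mathbf{u}$ is a perturbation imposed on the rigid rotation of the fluid at angular
velocity $\boldsymbol{\Omega}$.

a) Show that when viewed from a co-ordinate frame rotating with the fluid
we have

$$\frac{\partial\mathbf{u}}{\partial t} = \left(\frac{\partial\mathbf{u}}{\partial t} - \boldsymbol{\Omega}\times\mathbf{u} + ((\boldsymbol{\Omega}\times\mathbf{r})\cdot\nabla)\mathbf{u}\right)_{\rm lab}.$$

Deduce that the lab-frame Euler equation

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\nabla P,$$

becomes, in the rotating frame,

$$\frac{\partial\mathbf{u}}{\partial t} + 2(\boldsymbol{\Omega}\times\mathbf{u}) + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla\left(P - \frac{1}{2}|\boldsymbol{\Omega}\times\mathbf{r}|^2\right).$$

We see that in the non-inertial rotating frame the fluid experiences a
$-2(\boldsymbol{\Omega}\times\mathbf{u})$ Coriolis and a $\nabla|\boldsymbol{\Omega}\times\mathbf{r}|^2/2$ centrifugal force. By linearizing
the rotating-frame Euler equation, show that for small $\mathbf{u}$ we have

$$\frac{\partial\boldsymbol{\omega}}{\partial t} - 2(\boldsymbol{\Omega}\cdot\nabla)\mathbf{u} = 0, \qquad (\star)$$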

where $\boldsymbol{\omega} = {\rm curl}\,\mathbf{u}$.

b) Take $\boldsymbol{\Omega}$ to be directed along the $z$ axis. Seek plane-wave solutions to $(\star)$
in the form

$$\mathbf{u}(\mathbf{r},t) = \mathbf{u}_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)},$$

where $\mathbf{u}_0$ is a constant, and show that the dispersion equation for these
small amplitude inertial waves is

$$\omega^2 = 4\Omega^2\,\frac{k_z^2}{k_x^2 + k_y^2 + k_z^2}.$$

Deduce that the group velocity is directed perpendicular to $\mathbf{k}$, i.e. at
right-angles to the phase velocity. Conclude also that any slow flow
that is steady (time independent) when viewed from the rotating frame
is necessarily independent of the co-ordinate $z$. (This is the origin of
the phenomenon of Taylor columns, which are columns of stagnant fluid
lying above and below any obstacle immersed in such a flow.)
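A numerical sanity check on the group-velocity claim can be useful while working the exercise. The sketch below takes one branch of the inertial-wave dispersion relation, $\omega = 2\Omega k_z/|\mathbf{k}|$, differentiates it numerically with respect to the components of $\mathbf{k}$, and evaluates $\mathbf{k}\cdot\mathbf{v}_{\rm group}$. The rotation rate and the wavevector are illustrative values, and the check is no substitute for the analytic argument.

```python
import math

OMEGA_ROT = 1.0  # rotation rate of the tank (illustrative value)

def frequency(k):
    """One branch of the inertial-wave dispersion relation, omega = 2 Omega k_z / |k|."""
    kx, ky, kz = k
    return 2.0 * OMEGA_ROT * kz / math.sqrt(kx * kx + ky * ky + kz * kz)

def group_velocity(k, h=1e-6):
    """Gradient of omega with respect to k, by central differences."""
    gradient = []
    for i in range(3):
        k_plus, k_minus = list(k), list(k)
        k_plus[i] += h
        k_minus[i] -= h
        gradient.append((frequency(k_plus) - frequency(k_minus)) / (2.0 * h))
    return gradient

k = (1.0, 2.0, 3.0)
v_group = group_velocity(k)
k_dot_vg = sum(a * b for a, b in zip(k, v_group))
speed = math.sqrt(sum(c * c for c in v_group))

print("group velocity:", v_group)
print("k . v_group:   ", k_dot_vg)
```

The dot product vanishes to within the finite-difference error even though the group velocity itself is not small, which reflects the fact that $\omega$ depends only on the direction of $\mathbf{k}$ and not on its magnitude.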

Exercise 7.6: Non-linear Waves. In this problem we will explore the Riemann
invariants for a fluid with $P = \lambda^2\rho^3/3$. This is the equation of state of a
one-dimensional non-interacting Fermi gas.

a) From the continuity equation

$$\partial_t\rho + \partial_x\rho v = 0,$$

and Euler's equation of motion

$$\rho(\partial_t v + v\,\partial_x v) = -\partial_x P,$$

deduce that

$$\left(\frac{\partial}{\partial t} + (\lambda\rho + v)\frac{\partial}{\partial x}\right)(\lambda\rho + v) = 0,$$

$$\left(\frac{\partial}{\partial t} + (-\lambda\rho + v)\frac{\partial}{\partial x}\right)(-\lambda\rho + v) = 0.$$

In what limit do these equations become equivalent to the wave equation
for one-dimensional sound? What is the sound speed in this case?

b) Show that the Riemann invariants $v \pm \lambda\rho$ are constant on suitably defined
characteristic curves. What is the local speed of propagation of the waves
moving to the right or left?

c) The fluid starts from rest, $v = 0$, but with a region where the density
is higher than elsewhere. Show that the Riemann equations will
inevitably break down at some later time due to the formation of shock
waves.

Exercise 7.7: Burgers Shocks. As a simple mathematical model for the
formation and decay of a shock wave consider Burgers' Equation:

$$\partial_t u + u\,\partial_x u = \nu\,\partial_x^2 u.$$

Note its similarity to the Riemann equations of the previous exercise. The
additional term on the right-hand side introduces dissipation and prevents the
solution becoming multi-valued.

a) Show that if $\nu = 0$ any solution of Burgers' equation having a region
where $u$ decreases to the right will always eventually become multivalued.

b) Show that the Hopf-Cole transformation, $u = -2\nu\,\partial_x\ln\psi$, leads to $\psi$
obeying a heat diffusion equation

$$\partial_t\psi = \nu\,\partial_x^2\psi.$$

c) Show that

$$\psi(x,t) = Ae^{\nu a^2 t - ax} + Be^{\nu b^2 t - bx}$$

is a solution of this heat equation, and so deduce that Burgers' equation
has a shock-wave-like solution which travels to the right at speed
$C = \nu(a + b) = \frac{1}{2}(u_L + u_R)$, the mean of the wave speeds to the left and right
of the shock. Show that the width of the shock is $\approx 4\nu/|u_L - u_R|$.

Chapter 8
Special Functions



In solving Laplace's equation by the method of separation of variables we
come across the most important of the special functions of mathematical
physics. These functions have been studied for many years, and books such as
the Bateman manuscript project$^1$ summarize the results. Any serious student
of theoretical physics needs to be familiar with this material, and should at least
read the standard text: A Course of Modern Analysis by E. T. Whittaker
and G. N. Watson (Cambridge University Press). Although it was originally
published in 1902, nothing has superseded this book in its accessibility and
usefulness.

In this chapter we will focus only on the properties that all physics
students should know by heart.

8.1 Curvilinear Co-ordinates 

Laplace's equation can be separated in a number of coordinate systems. 
These are all orthogonal systems in that the local coordinate axes cross at 
right angles. 

1 The Bateman manuscript project contains the formulas collected by Harry Bateman, 
who was professor of Mathematics, Theoretical Physics, and Aeronautics at the California 
Institute of Technology. After his death in 1946, several dozen shoe boxes full of file cards 
were found in his garage. These proved to be the index to a mountain of paper contain- 
ing his detailed notes. A subset of the material was eventually published as the three 
volume series Higher Transcendental Functions, and the two volume Tables of Integral 
Transformations, A. Erdélyi et al. eds.

To any system of orthogonal curvilinear coordinates is associated a metric
of the form

$$ds^2 = h_1^2 (dx^1)^2 + h_2^2 (dx^2)^2 + h_3^2 (dx^3)^2. \tag{8.1}$$

This expression tells us the distance $\sqrt{ds^2}$ between the adjacent points
$(x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$ and $(x^1, x^2, x^3)$. In general, the $h_i$ will depend
on the co-ordinates $x^i$.

The most commonly used orthogonal curvilinear co-ordinate systems are
plane polars, spherical polars, and cylindrical polars. The Laplacian also
separates in plane elliptic, or three-dimensional ellipsoidal coordinates and
their degenerate limits, such as parabolic cylindrical co-ordinates, but these
are not so often encountered, and for their properties we refer the reader to
comprehensive treatises such as Morse and Feshbach's Methods of Theoretical
Physics.

Plane Polar Co-ordinates

Figure 8.1: Plane polar co-ordinates.



Plane polar co-ordinates have metric

$$ds^2 = dr^2 + r^2 d\theta^2, \tag{8.2}$$

so $h_r = 1$, $h_\theta = r$.

Spherical Polar Co-ordinates




Figure 8.2: Spherical co-ordinates. 

This system has metric 

$$ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2,$$

so $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$.

Cylindrical Polar Co-ordinates 




Figure 8.3: Cylindrical co-ordinates. 

These have metric 

$$ds^2 = dr^2 + r^2 d\theta^2 + dz^2,$$

so $h_r = 1$, $h_\theta = r$, $h_z = 1$.

8.1.1 Div, Grad and Curl in Curvilinear Co-ordinates

It is very useful to know how to write the curvilinear co-ordinate expressions 
for the common operations of the vector calculus. Knowing these, we can 
then write down the expression for the Laplace operator. 

The gradient operator 

We begin with the gradient operator. This is a vector quantity, and to
express it we need to understand how to associate a set of basis vectors with
our co-ordinate system. The simplest thing to do is to take unit vectors $\mathbf{e}_i$
tangential to the local co-ordinate axes. Because the coordinate system is
orthogonal, these unit vectors will then constitute an orthonormal system.

Figure 8.4: Unit basis vectors in plane polar co-ordinates.

The vector corresponding to an infinitesimal co-ordinate displacement $dx^i$ is
then given by

$$d\mathbf{r} = h_1 dx^1\, \mathbf{e}_1 + h_2 dx^2\, \mathbf{e}_2 + h_3 dx^3\, \mathbf{e}_3. \tag{8.5}$$

Using the orthonormality of the basis vectors, we find that

$$ds^2 \equiv |d\mathbf{r}|^2 = h_1^2 (dx^1)^2 + h_2^2 (dx^2)^2 + h_3^2 (dx^3)^2, \tag{8.6}$$

as before.

In the unit-vector basis, the gradient vector is

$$\operatorname{grad}\varphi \equiv \nabla\varphi = \frac{1}{h_1}\frac{\partial\varphi}{\partial x^1}\,\mathbf{e}_1 + \frac{1}{h_2}\frac{\partial\varphi}{\partial x^2}\,\mathbf{e}_2 + \frac{1}{h_3}\frac{\partial\varphi}{\partial x^3}\,\mathbf{e}_3,$$

so that

$$(\operatorname{grad}\varphi)\cdot d\mathbf{r} = \frac{\partial\varphi}{\partial x^1}dx^1 + \frac{\partial\varphi}{\partial x^2}dx^2 + \frac{\partial\varphi}{\partial x^3}dx^3, \tag{8.8}$$

which is the change in the value of $\varphi$ due to the displacement.

The numbers $(h_1 dx^1, h_2 dx^2, h_3 dx^3)$ are often called the physical components
of the displacement $d\mathbf{r}$, to distinguish them from the numbers $(dx^1, dx^2, dx^3)$
which are the co-ordinate components of $d\mathbf{r}$. The physical components of a
displacement vector all have the dimensions of length. The co-ordinate components
may have different dimensions and units for each component. In
plane polar co-ordinates, for example, the units will be meters and radians.
This distinction extends to the gradient itself: the co-ordinate components
of an electric field expressed in polar co-ordinates will have units of volts
per meter and volts per radian for the radial and angular components, respectively.
The factor $1/h_\theta = r^{-1}$ serves to convert the latter to volts per
meter.

The divergence 

The divergence of a vector field $\mathbf{A}$ is defined to be the flux of $\mathbf{A}$ out of an
infinitesimal region, divided by the volume of the region.

Figure 8.5: Flux out of an infinitesimal volume with sides of length $h_1 dx^1$,
$h_2 dx^2$, $h_3 dx^3$.

In the figure, the flux out of the two end faces is

$$dx^2 dx^3 \left[ A_1 h_2 h_3 \big|_{(x^1+dx^1,\,x^2,\,x^3)} - A_1 h_2 h_3 \big|_{(x^1,\,x^2,\,x^3)} \right] \approx dx^1 dx^2 dx^3\, \frac{\partial (A_1 h_2 h_3)}{\partial x^1}. \tag{8.9}$$

Adding the contributions from the other two pairs of faces, and dividing by
the volume, $h_1 h_2 h_3\, dx^1 dx^2 dx^3$, gives

$$\operatorname{div}\mathbf{A} = \frac{1}{h_1 h_2 h_3} \left\{ \frac{\partial}{\partial x^1}(h_2 h_3 A_1) + \frac{\partial}{\partial x^2}(h_1 h_3 A_2) + \frac{\partial}{\partial x^3}(h_1 h_2 A_3) \right\}. \tag{8.10}$$

Note that in curvilinear coordinates $\operatorname{div}\mathbf{A}$ is no longer simply $\nabla\cdot\mathbf{A}$, although
one often writes it as such.



The curl 

The curl of a vector field $\mathbf{A}$ is a vector whose component in the direction of
the normal to an infinitesimal area element is the line integral of $\mathbf{A}$ round the
infinitesimal area, divided by the area.

Figure 8.6: Line integral round infinitesimal area with sides of length $h_1 dx^1$,
$h_2 dx^2$, and normal $\mathbf{e}_3$.

The third component is, for example,

$$(\operatorname{curl}\mathbf{A})_3 = \frac{1}{h_1 h_2}\left( \frac{\partial (h_2 A_2)}{\partial x^1} - \frac{\partial (h_1 A_1)}{\partial x^2} \right).$$

The other two components are found by cyclically permuting $1 \to 2 \to 3 \to 1$
in this formula. The curl is thus no longer equal to $\nabla\times\mathbf{A}$, although it is
common to write it as if it were.

Note that the factors of $h_i$ are disposed so that the vector identities

$$\operatorname{curl}\operatorname{grad}\varphi = 0, \tag{8.12}$$

and

$$\operatorname{div}\operatorname{curl}\mathbf{A} = 0, \tag{8.13}$$
continue to hold for any scalar field $\varphi$, and any vector field $\mathbf{A}$.

8.1.2 The Laplacian in Curvilinear Co-ordinates

The Laplacian acting on scalars is "div grad", and is therefore

$$\nabla^2\varphi = \frac{1}{h_1 h_2 h_3}\left\{ \frac{\partial}{\partial x^1}\left(\frac{h_2 h_3}{h_1}\frac{\partial\varphi}{\partial x^1}\right) + \frac{\partial}{\partial x^2}\left(\frac{h_1 h_3}{h_2}\frac{\partial\varphi}{\partial x^2}\right) + \frac{\partial}{\partial x^3}\left(\frac{h_1 h_2}{h_3}\frac{\partial\varphi}{\partial x^3}\right) \right\}. \tag{8.14}$$

This formula is worth committing to memory. 
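
The general formula (8.14) can be spot-checked numerically. The following is a minimal sketch in Python and is not part of the original notes: the helper names `laplacian`, `spherical` and `cylindrical` are illustrative assumptions, and the derivatives are approximated by central differences, so the results are accurate only to roughly the square of the step size.

```python
import math

def laplacian(phi, scale, x, eps=1e-4):
    """Central-difference estimate of the Laplacian (8.14) at x = (x1, x2, x3).

    phi(x)   -> value of the scalar field at the co-ordinate triple x
    scale(x) -> the scale factors (h1, h2, h3) at x
    """
    def shifted(point, i, delta):
        q = list(point)
        q[i] += delta
        return q

    def weighted_derivative(i, point):
        # (h1 h2 h3 / h_i^2) * d(phi)/dx^i, the quantity differentiated in (8.14)
        h = scale(point)
        dphi = (phi(shifted(point, i, eps)) - phi(shifted(point, i, -eps))) / (2 * eps)
        return h[0] * h[1] * h[2] / h[i] ** 2 * dphi

    total = 0.0
    for i in range(3):
        total += (weighted_derivative(i, shifted(x, i, eps))
                  - weighted_derivative(i, shifted(x, i, -eps))) / (2 * eps)
    h = scale(x)
    return total / (h[0] * h[1] * h[2])

def spherical(x):      # x = (r, theta, phi): h = (1, r, r sin(theta))
    r, theta, _ = x
    return (1.0, r, r * math.sin(theta))

def cylindrical(x):    # x = (r, theta, z): h = (1, r, 1)
    r, _, _ = x
    return (1.0, r, 1.0)

point = (1.3, 0.7, 0.4)
print(laplacian(lambda x: x[0] ** 2, spherical, point))        # r^2: close to 6
print(laplacian(lambda x: 1.0 / x[0], spherical, point))       # 1/r: close to 0
print(laplacian(lambda x: x[0] ** 2 * math.cos(2 * x[1]), cylindrical, point))  # x^2 - y^2: close to 0
```

Only the function `scale` changes from one co-ordinate system to another, which is the practical content of (8.14).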

When the Laplacian is to act on a vector field, we must use the vector 
Laplacian 

V 2 A = grad div A - curl curl A. (8.15) 

In curvilinear co-ordinates this is no longer equivalent to the Laplacian acting 
on each component of A, treating it as if it were a scalar. The expression 
(8.15) is the appropriate generalization of the vector Laplacian to curvilinear 
co-ordinates because it is defined in terms of the co-ordinate independent 
operators div, grad, and curl, and reduces to the Laplacian on the individual 
components when the co-ordinate system is Cartesian.

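In spherical polars the Laplace operator acting on the scalar field $\varphi$ is

$$\nabla^2\varphi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\varphi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\varphi}{\partial\phi^2}$$

$$= \frac{1}{r}\frac{\partial^2 (r\varphi)}{\partial r^2} + \frac{1}{r^2}\left( \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\varphi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\varphi}{\partial\phi^2} \right)$$

$$= \frac{1}{r}\frac{\partial^2 (r\varphi)}{\partial r^2} - \frac{\hat{L}^2}{r^2}\varphi, \tag{8.16}$$

where

$$\hat{L}^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}, \tag{8.17}$$

is (after multiplication by $\hbar^2$) the operator representing the square of the
angular momentum in quantum mechanics.

In cylindrical polars the Laplacian is

$$\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r} r \frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}. \tag{8.18}$$

8.2 Spherical Harmonics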

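We saw that Laplace's equation in spherical polars is

$$0 = \frac{1}{r}\frac{\partial^2 (r\varphi)}{\partial r^2} - \frac{\hat{L}^2}{r^2}\varphi. \tag{8.19}$$

To solve this by the method of separation of variables, we factorize

$$\varphi = R(r)\,Y(\theta,\phi), \tag{8.20}$$

so that

$$\frac{1}{Rr}\frac{d^2 (rR)}{dr^2} - \frac{1}{r^2}\left(\frac{1}{Y}\hat{L}^2 Y\right) = 0. \tag{8.21}$$

Taking the separation constant to be $l(l + 1)$, we have

$$r^2\frac{d^2 (rR)}{dr^2} - l(l+1)(rR) = 0, \tag{8.22}$$

and

$$\hat{L}^2 Y = l(l + 1)Y. \tag{8.23}$$

The solution for $R$ is $r^l$ or $r^{-l-1}$. The equation for $Y$ can be further decomposed
by setting $Y = \Theta(\theta)\Phi(\phi)$. Looking back at the definition of $\hat{L}^2$, we see
that we can take

$$\Phi(\phi) = e^{im\phi} \tag{8.24}$$

with $m$ an integer to ensure single valuedness. The equation for $\Theta$ is then

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\Theta = -l(l + 1)\Theta. \tag{8.25}$$

It is convenient to set $x = \cos\theta$; then

$$\left( \frac{d}{dx}(1 - x^2)\frac{d}{dx} + l(l + 1) - \frac{m^2}{1 - x^2} \right)\Theta = 0. \tag{8.26}$$

8.2.1 Legendre Polynomials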

We first look at the axially symmetric case where $m = 0$. We are left with

$$\left( \frac{d}{dx}(1 - x^2)\frac{d}{dx} + l(l + 1) \right)\Theta = 0. \tag{8.27}$$

This is Legendre's equation. We can think of it as an eigenvalue problem

$$-\left( \frac{d}{dx}(1 - x^2)\frac{d}{dx} \right)\Theta(x) = l(l + 1)\Theta(x), \tag{8.28}$$

on the interval $-1 \le x \le 1$, this being the range of $\cos\theta$ for real $\theta$. Legendre's
equation is of Sturm-Liouville form, but with regular singular points at $x = \pm 1$.
Because the endpoints of the interval are singular, we cannot impose
as boundary conditions that $\Theta$, $\Theta'$, or some linear combination of these, be
zero there. We do need some boundary conditions, however, so as to have a
self-adjoint operator and a complete set of eigenfunctions.

Given one or more singular endpoints, a possible route to a well-defined
eigenvalue problem is to require solutions to be square-integrable, and so
normalizable. This condition suffices for the harmonic-oscillator Schrödinger
equation, for example, because at most one of the two solutions is
square-integrable. For Legendre's equation with $l = 0$, the two independent solutions
are $\Theta(x) = 1$ and $\Theta(x) = \ln(1 + x) - \ln(1 - x)$. Both of these solutions have
finite $L^2[-1, 1]$ norms, and this square integrability persists for all values of
$l$. Thus, demanding normalizability is not enough to select a unique boundary
condition. Instead, each endpoint possesses a one-parameter family of
boundary conditions that lead to self-adjoint operators. We therefore make
the more restrictive demand that the allowed eigenfunctions be finite at the
endpoints. Because the north and south poles of the sphere are not special
points, this is a physically reasonable condition. When $l$ is an integer, then
one of the solutions, $P_l(x)$, becomes a polynomial, and so is finite at $x = \pm 1$.
The second solution $Q_l(x)$ is divergent at both ends, and so is not an allowed
solution. When $l$ is not an integer, neither solution is finite. The eigenvalues
are therefore $l(l + 1)$ with $l$ zero or a positive integer. Despite its unfamiliar
form, the "finite" boundary condition makes the Legendre operator
self-adjoint, and the Legendre polynomials $P_l(x)$ form a complete orthogonal
set for $L^2[-1, 1]$.

Proving orthogonality is easy: we follow the usual strategy for Sturm-Liouville
equations with non-singular boundary conditions to deduce that

$$[l(l + 1) - m(m + 1)]\int_{-1}^{1} P_l(x)P_m(x)\,dx = \left[ (P_l P_m' - P_l' P_m)(1 - x^2) \right]_{-1}^{1}. \tag{8.29}$$

Since the $P_l$'s remain finite at $\pm 1$, the right hand side is zero because of the
$(1 - x^2)$ factor, and so $\int_{-1}^{1} P_l(x)P_m(x)\,dx$ is zero if $l \ne m$. (Observe that
this last step differs from the usual argument where it is the vanishing of the
eigenfunction or its derivative that makes the integrated-out term zero.)

Because they are orthogonal polynomials, the $P_l(x)$ can be obtained by
applying the Gram-Schmidt procedure to the sequence $1, x, x^2, \ldots$ to obtain
polynomials orthogonal with respect to the $w = 1$ inner product, and then
fixing the normalization constant. The result of this process can be expressed
in closed form as

$$P_l(x) = \frac{1}{2^l\, l!}\frac{d^l}{dx^l}(x^2 - 1)^l.$$

This is called Rodrigues' formula. It should be clear that this formula outputs
a polynomial of degree $l$. The coefficient $1/2^l l!$ comes from the traditional
normalization for the Legendre polynomials that makes $P_l(1) = 1$. This
convention does not lead to an orthonormal set. Instead, we have

$$\int_{-1}^{1} P_l(x)P_m(x)\,dx = \frac{2}{2l + 1}\delta_{lm}. \tag{8.31}$$

It is easy to show that this integral is zero if $l > m$: simply integrate by
parts $l$ times so as to take the $l$ derivatives off $(x^2 - 1)^l$ and onto $(x^2 - 1)^m$,
which they kill. We will evaluate the $l = m$ integral in the next section.
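
Rodrigues' formula and the orthogonality relation (8.31) are easy to check with exact rational arithmetic. The sketch below is an illustration rather than anything from the original notes: the function names `legendre` and `inner` are our own, and polynomials are represented as coefficient lists with the lowest power first.

```python
from fractions import Fraction
from math import comb, factorial

def legendre(l):
    """Coefficients of P_l(x), lowest power first, built from Rodrigues' formula."""
    # Binomial expansion: (x^2 - 1)^l = sum_k C(l, k) (-1)^(l-k) x^(2k)
    coeffs = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):
        coeffs[2 * k] = Fraction(comb(l, k) * (-1) ** (l - k))
    # Differentiate l times: d/dx of c_n x^n is n c_n x^(n-1)
    for _ in range(l):
        coeffs = [n * c for n, c in enumerate(coeffs)][1:]
    norm = Fraction(1, 2 ** l * factorial(l))
    return [norm * c for c in coeffs]

def inner(p, q):
    """Exact integral of p(x) q(x) over [-1, 1] for coefficient lists p and q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:   # odd powers of x integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

print(legendre(2))                        # coefficients of (3x^2 - 1)/2
print(sum(legendre(5)))                   # P_5(1), the sum of the coefficients
print(inner(legendre(2), legendre(4)))    # orthogonality of P_2 and P_4
print(inner(legendre(3), legendre(3)))    # compare with 2/(2l + 1) at l = 3
```

Because the arithmetic is exact, the comparison with $2/(2l + 1)$ is an equality of fractions rather than a floating-point approximation.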

We now show that the $P_l(x)$ given by Rodrigues' formula are indeed solutions
of Legendre's equation: Let $v = (x^2 - 1)^l$, then

$$(1 - x^2)v' + 2lxv = 0. \tag{8.32}$$

We differentiate this $l + 1$ times using Leibniz' theorem

$$[uv]^{(n)} = \sum_{m=0}^{n}\binom{n}{m}u^{(m)}v^{(n-m)} = uv^{(n)} + nu'v^{(n-1)} + \frac{1}{2}n(n - 1)u''v^{(n-2)} + \cdots. \tag{8.33}$$

We find that

$$[(1 - x^2)v']^{(l+1)} = (1 - x^2)v^{(l+2)} - (l + 1)2xv^{(l+1)} - l(l + 1)v^{(l)},$$

$$[2lxv]^{(l+1)} = 2lxv^{(l+1)} + 2l(l + 1)v^{(l)}. \tag{8.34}$$

Putting these two terms together we obtain

$$\left( \frac{d}{dx}(1 - x^2)\frac{d}{dx} + l(l + 1) \right)\frac{d^l}{dx^l}(x^2 - 1)^l = 0, \tag{8.35}$$

which is Legendre's equation.

The $P_l(x)$ have alternating parity

$$P_l(-x) = (-1)^l P_l(x), \tag{8.36}$$

and the first few are

$$P_0(x) = 1,$$
$$P_1(x) = x,$$
$$P_2(x) = \frac{1}{2}(3x^2 - 1),$$
$$P_3(x) = \frac{1}{2}(5x^3 - 3x),$$
$$P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3).$$



8.2.2 Axisymmetric potential problems 

The essential property of the $P_l(x)$ is that the general axisymmetric solution
of $\nabla^2\varphi = 0$ can be expanded in terms of them as

$$\varphi(r, \theta) = \sum_{l=0}^{\infty}\left( A_l r^l + B_l r^{-l-1} \right)P_l(\cos\theta). \tag{8.37}$$

You should memorize this formula. You should also know by heart the explicit
expressions for the first four $P_l(x)$, and the factor of $2/(2l + 1)$ in the
orthogonality formula.

Example: Point charge. Put a unit charge at the point $\mathbf{R}$, and find an expansion
for the potential as a Legendre polynomial series in a neighbourhood
of the origin.

Figure 8.7: Geometry for generating function.

Let us start by assuming that $|\mathbf{r}| < |\mathbf{R}|$. We know that in this region the point
charge potential $1/|\mathbf{r} - \mathbf{R}|$ is a solution of Laplace's equation, and so we can
expand

$$\frac{1}{|\mathbf{r} - \mathbf{R}|} \equiv \frac{1}{\sqrt{r^2 + R^2 - 2rR\cos\theta}} = \sum_{l=0}^{\infty} A_l r^l P_l(\cos\theta). \tag{8.38}$$

We know that the coefficients $B_l$ are zero because $\varphi$ is finite when $r = 0$.
We can find the coefficients $A_l$ by setting $\theta = 0$ and Taylor expanding

$$\frac{1}{|\mathbf{r} - \mathbf{R}|} = \frac{1}{R - r} = \frac{1}{R}\left( 1 + \left(\frac{r}{R}\right) + \left(\frac{r}{R}\right)^2 + \cdots \right), \quad r < R.$$

By comparing the two series and noting that $P_l(1) = 1$, we find that $A_l = R^{-l-1}$. Thus

$$\frac{1}{\sqrt{r^2 + R^2 - 2rR\cos\theta}} = \frac{1}{R}\sum_{l=0}^{\infty}\left(\frac{r}{R}\right)^l P_l(\cos\theta), \quad r < R.$$

This last expression is the generating function formula for Legendre polynomials.
It is also a useful formula to have in your long-term memory.
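
The generating function expansion for $r < R$ can be compared numerically with the closed-form inverse distance. This is a minimal sketch under stated assumptions: the $P_l$ are evaluated with the standard three-term (Bonnet) recurrence, which is not derived in these notes, and the function names and the truncation at `lmax = 60` are our own choices.

```python
import math

def legendre_values(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] via the three-term recurrence
    (l + 1) P_{l+1} = (2l + 1) x P_l - l P_{l-1}."""
    values = [1.0, x]
    for l in range(1, lmax):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: lmax + 1]

def series_inverse_distance(r, R, theta, lmax=60):
    """Partial sum of (1/R) sum_l (r/R)^l P_l(cos theta), intended for r < R."""
    p = legendre_values(lmax, math.cos(theta))
    return sum((r / R) ** l * p[l] for l in range(lmax + 1)) / R

def exact_inverse_distance(r, R, theta):
    """Closed form 1/sqrt(r^2 + R^2 - 2 r R cos theta)."""
    return 1.0 / math.sqrt(r * r + R * R - 2 * r * R * math.cos(theta))

r, R, theta = 0.5, 1.0, 1.1
print(series_inverse_distance(r, R, theta), exact_inverse_distance(r, R, theta))
```

The series converges geometrically in $r/R$, so the number of terms needed grows quickly as $r$ approaches $R$.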
If $|\mathbf{r}| > |\mathbf{R}|$, then we must take

$$\frac{1}{|\mathbf{r} - \mathbf{R}|} \equiv \frac{1}{\sqrt{r^2 + R^2 - 2rR\cos\theta}} = \sum_{l=0}^{\infty} B_l r^{-l-1}P_l(\cos\theta), \tag{8.41}$$

because we know that $\varphi$ tends to zero when $r = \infty$. We now set $\theta = 0$ and
compare with

$$\frac{1}{|\mathbf{r} - \mathbf{R}|} = \frac{1}{r - R} = \frac{1}{r}\left( 1 + \left(\frac{R}{r}\right) + \left(\frac{R}{r}\right)^2 + \cdots \right), \quad R < r, \tag{8.42}$$

to get

$$\frac{1}{\sqrt{r^2 + R^2 - 2rR\cos\theta}} = \frac{1}{r}\sum_{l=0}^{\infty}\left(\frac{R}{r}\right)^l P_l(\cos\theta), \quad R < r. \tag{8.43}$$

Observe that we made no use of the normalization integral

$$\int_{-1}^{1}\{P_l(x)\}^2\,dx = 2/(2l + 1) \tag{8.44}$$

in deriving the generating function expansion for the Legendre polynomials.
The following exercise shows that this expansion, taken together with
their previously established orthogonality property, can be used to establish
8.44.

Exercise 8.1: Use the generating function for Legendre polynomials $P_l(x)$ to
show that

$$\sum_{l=0}^{\infty} z^{2l}\left( \int_{-1}^{1}\{P_l(x)\}^2\,dx \right) = \int_{-1}^{1}\frac{1}{1 - 2xz + z^2}\,dx = -\frac{1}{z}\ln\left(\frac{1 - z}{1 + z}\right), \quad |z| < 1.$$

By Taylor expanding the logarithm, and comparing the coefficients of $z^{2l}$,
evaluate $\int_{-1}^{1}\{P_l(x)\}^2\,dx$.

Example: A planet is spinning on its axis and so its shape deviates slightly
from a perfect sphere. The position of its surface is given by

$$R(\theta, \phi) = R_0 + \eta P_2(\cos\theta). \tag{8.45}$$

Observe that, to first order in $\eta$, this deformation does not alter the volume
of the body. Assuming that the planet has a uniform density $\rho_0$, compute
the external gravitational potential of the planet.

Figure 8.8: Deformed planet.

The gravitational potential obeys Poisson's equation

$$\nabla^2\phi = 4\pi G\rho(\mathbf{x}), \tag{8.46}$$

where $G$ is Newton's gravitational constant. We expand $\phi$ as a power series
in $\eta$

$$\phi(r, \theta) = \phi_0(r, \theta) + \phi_1(r, \theta) + \cdots. \tag{8.47}$$

We also decompose the gravitating mass into a uniform undeformed sphere,
which gives the external potential

$$\phi_{0,\mathrm{ext}}(r, \theta) = -\left( \frac{4}{3}\pi R_0^3\rho_0 \right)\frac{G}{r}, \quad r > R_0, \tag{8.48}$$

and a thin spherical shell of areal mass-density

$$\sigma(\theta) = \rho_0\,\eta P_2(\cos\theta). \tag{8.49}$$

The thin shell gives rise to the potential

$$\phi_{1,\mathrm{int}}(r, \theta) = Ar^2 P_2(\cos\theta), \quad r < R_0, \tag{8.50}$$

and

$$\phi_{1,\mathrm{ext}}(r, \theta) = B\frac{1}{r^3}P_2(\cos\theta), \quad r > R_0. \tag{8.51}$$

At the shell we must have $\phi_{1,\mathrm{int}} = \phi_{1,\mathrm{ext}}$ and

$$\frac{\partial\phi_{1,\mathrm{ext}}}{\partial r} - \frac{\partial\phi_{1,\mathrm{int}}}{\partial r} = 4\pi G\sigma(\theta). \tag{8.52}$$

Thus $A = BR_0^{-5}$, and

$$B = -\frac{4}{5}\pi G\eta\rho_0 R_0^4. \tag{8.53}$$

Putting this together, we have

$$\phi(r, \theta) = -\left( \frac{4}{3}\pi G\rho_0 R_0^3 \right)\frac{1}{r} - \frac{4}{5}\left( \pi G\eta\rho_0 R_0^4 \right)\frac{P_2(\cos\theta)}{r^3} + O(\eta^2), \quad r > R_0. \tag{8.54}$$

8.2.3 General spherical harmonics 

When we do not have axisymmetry, we need the full set of spherical harmonics.
These involve solutions of

$$\left( \frac{d}{dx}(1 - x^2)\frac{d}{dx} + l(l + 1) - \frac{m^2}{1 - x^2} \right)\Theta = 0, \tag{8.55}$$

which is the associated Legendre equation. This looks like another complicated
equation with singular endpoints, but its bounded solutions can
be obtained by differentiating Legendre polynomials. On substituting
$\Theta = (1 - x^2)^{m/2}z(x)$ into (8.55), and comparing the resulting equation for $z(x)$
with the $m$-th derivative of Legendre's equation, we find that

$$P_l^m(x) \equiv (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) \tag{8.56}$$

is a solution of (8.55) that remains finite ($m = 0$) or goes to zero ($m > 0$)
at the endpoints $x = \pm 1$. Since $P_l(x)$ is a polynomial of degree $l$, we must
have $P_l^m(x) = 0$ if $m > l$. For each $l$, the allowed values of $m$ in this
formula are therefore $0, 1, \ldots, l$. Our definition (8.56) of the $P_l^m(x)$ can be
extended to negative integer $m$ by interpreting $d^{-|m|}/dx^{-|m|}$ as an instruction
to integrate the Legendre polynomial $|m|$ times, instead of differentiating it,
but the resulting $P_l^{-|m|}(x)$ are proportional to $P_l^{|m|}(x)$, so nothing new is
gained by this conceit.

The spherical harmonics are the normalized product of these associated
Legendre functions with the corresponding $e^{im\phi}$:

$$Y_l^m(\theta, \phi) \propto P_l^{|m|}(\cos\theta)\,e^{im\phi}, \quad -l \le m \le l. \tag{8.57}$$


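The first few are

$$l = 0: \quad Y_0^0 = \frac{1}{\sqrt{4\pi}}, \tag{8.58}$$

$$l = 1: \quad Y_1^1 = -\sqrt{\frac{3}{8\pi}}\sin\theta\,e^{i\phi}, \quad Y_1^0 = \sqrt{\frac{3}{4\pi}}\cos\theta, \quad Y_1^{-1} = \sqrt{\frac{3}{8\pi}}\sin\theta\,e^{-i\phi}, \tag{8.59}$$

$$l = 2: \quad Y_2^2 = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\sin^2\theta\,e^{2i\phi}, \quad Y_2^1 = -\sqrt{\frac{15}{8\pi}}\sin\theta\cos\theta\,e^{i\phi}, \quad Y_2^0 = \sqrt{\frac{5}{4\pi}}\left( \frac{3}{2}\cos^2\theta - \frac{1}{2} \right),$$
$$Y_2^{-1} = \sqrt{\frac{15}{8\pi}}\sin\theta\cos\theta\,e^{-i\phi}, \quad Y_2^{-2} = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\sin^2\theta\,e^{-2i\phi}. \tag{8.60}$$

The spherical harmonics compose an orthonormal

$$\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\,[Y_l^m(\theta, \phi)]^* Y_{l'}^{m'}(\theta, \phi) = \delta_{ll'}\delta_{mm'}, \tag{8.61}$$

and complete

$$\sum_{l=0}^{\infty}\sum_{m=-l}^{l}[Y_l^m(\theta', \phi')]^* Y_l^m(\theta, \phi) = \delta(\phi - \phi')\,\delta(\cos\theta' - \cos\theta) \tag{8.62}$$

set of functions on the unit sphere. In terms of them, the general solution to
$\nabla^2\varphi = 0$ is

$$\varphi(r, \theta, \phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left( A_{lm}r^l + B_{lm}r^{-l-1} \right)Y_l^m(\theta, \phi). \tag{8.63}$$

This is definitely a formula to remember.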

For $m = 0$, the spherical harmonics are independent of the azimuthal
angle $\phi$, and so must be proportional to the Legendre polynomials. The
exact relation is

$$Y_l^0(\theta, \phi) = \sqrt{\frac{2l + 1}{4\pi}}\,P_l(\cos\theta). \tag{8.64}$$

If we use a unit vector $\mathbf{n}$ to denote a point on the unit sphere, we have the
symmetry properties

$$[Y_l^m(\mathbf{n})]^* = (-1)^m Y_l^{-m}(\mathbf{n}), \quad Y_l^m(-\mathbf{n}) = (-1)^l Y_l^m(\mathbf{n}). \tag{8.65}$$

These identities are useful when we wish to know how quantum mechanical
wavefunctions transform under time reversal or parity.

There is an addition theorem

$$P_l(\cos\gamma) = \frac{4\pi}{2l + 1}\sum_{m=-l}^{l}[Y_l^m(\theta', \phi')]^* Y_l^m(\theta, \phi), \tag{8.66}$$

where $\gamma$ is the angle between the directions $(\theta, \phi)$ and $(\theta', \phi')$, and is found
from

$$\cos\gamma = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\phi - \phi'). \tag{8.67}$$

The addition theorem is established by first showing that the right-hand side
is rotationally invariant, and then setting the direction $(\theta', \phi')$ to point along
the $z$ axis. Addition theorems of this sort are useful because they allow one
to replace a simple function of an entangled variable by a sum of functions
of unentangled variables. For example, the point-charge potential can be
disentangled as

$$\frac{1}{|\mathbf{r} - \mathbf{r}'|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{4\pi}{2l + 1}\left( \frac{r_<^l}{r_>^{l+1}} \right)[Y_l^m(\theta', \phi')]^* Y_l^m(\theta, \phi),$$

where $r_<$ is the smaller of $|\mathbf{r}|$ or $|\mathbf{r}'|$, and $r_>$ is the greater, and $(\theta, \phi)$, $(\theta', \phi')$
specify the direction of $\mathbf{r}$, $\mathbf{r}'$ respectively. This expansion is derived by combining
the generating function for the Legendre polynomials with the addition
formula. It is useful for defining and evaluating multipole expansions.
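
The addition theorem (8.66) can be verified numerically for small $l$ using the explicit spherical harmonics. The sketch below is illustrative only: it hard-codes the standard $l = 1$ and $l = 2$ harmonics, obtains the negative-$m$ members from the symmetry property (8.65), and the function names are our own.

```python
import cmath
import math

def Y(l, m, theta, phi):
    """Explicit spherical harmonics for l = 1 and l = 2 (standard normalization)."""
    if m < 0:
        # Symmetry property (8.65): Y_l^{-m} = (-1)^m [Y_l^m]^*
        return (-1) ** m * Y(l, -m, theta, phi).conjugate()
    s, c = math.sin(theta), math.cos(theta)
    table = {
        (1, 0): math.sqrt(3 / (4 * math.pi)) * c,
        (1, 1): -math.sqrt(3 / (8 * math.pi)) * s,
        (2, 0): math.sqrt(5 / (4 * math.pi)) * (1.5 * c * c - 0.5),
        (2, 1): -math.sqrt(15 / (8 * math.pi)) * s * c,
        (2, 2): 0.25 * math.sqrt(15 / (2 * math.pi)) * s * s,
    }
    return table[(l, m)] * cmath.exp(1j * m * phi)

def legendre(l, x):
    """P_1(x) and P_2(x) only, matching the table above."""
    return x if l == 1 else 1.5 * x * x - 0.5

def addition_theorem_sides(l, theta, phi, theta_p, phi_p):
    """Return (left, right) sides of the addition theorem (8.66)."""
    cos_gamma = (math.cos(theta) * math.cos(theta_p)
                 + math.sin(theta) * math.sin(theta_p) * math.cos(phi - phi_p))
    total = sum(Y(l, m, theta_p, phi_p).conjugate() * Y(l, m, theta, phi)
                for m in range(-l, l + 1))
    return legendre(l, cos_gamma), 4 * math.pi / (2 * l + 1) * total

for l in (1, 2):
    print(l, addition_theorem_sides(l, 0.7, 0.3, 1.9, 2.2))
```

The right-hand side is returned as a complex number; its imaginary part should vanish to rounding error because the $\pm m$ terms are complex conjugates of each other.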

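Exercise 8.2: Show that

$$\begin{pmatrix} Y_1^1 \\ Y_1^0 \\ Y_1^{-1} \end{pmatrix} \propto \begin{pmatrix} x + iy \\ z \\ x - iy \end{pmatrix}, \qquad \begin{pmatrix} Y_2^2 \\ Y_2^1 \\ Y_2^0 \\ Y_2^{-1} \\ Y_2^{-2} \end{pmatrix} \propto \begin{pmatrix} (x + iy)^2 \\ (x + iy)z \\ x^2 + y^2 - 2z^2 \\ (x - iy)z \\ (x - iy)^2 \end{pmatrix},$$

where $x$, $y$, $z$, with $x^2 + y^2 + z^2 = 1$, are the usual Cartesian co-ordinates, restricted to the
unit sphere.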



8.3 Bessel Functions 

In cylindrical polar co-ordinates, Laplace's equation is

$$0 = \nabla^2\varphi = \frac{1}{r}\frac{\partial}{\partial r}r\frac{\partial\varphi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\varphi}{\partial\theta^2} + \frac{\partial^2\varphi}{\partial z^2}. \tag{8.69}$$

If we set $\varphi = R(r)e^{im\theta}e^{\pm kz}$ we find that $R(r)$ obeys

$$\frac{d^2 R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left( k^2 - \frac{m^2}{r^2} \right)R = 0. \tag{8.70}$$

Now

$$\frac{d^2 y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left( 1 - \frac{\nu^2}{x^2} \right)y = 0 \tag{8.71}$$

is Bessel's equation and its solutions are Bessel functions of order $\nu$. The
solutions for $R$ will therefore be Bessel functions of order $m$, but with $x$
replaced by $kr$.

8.3.1 Cylindrical Bessel Functions



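We now set about solving Bessel's equation,

$$\frac{d^2 y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left( 1 - \frac{\nu^2}{x^2} \right)y(x) = 0. \tag{8.72}$$

This has a regular singular point at the origin, and an irregular singular point
at infinity. We seek a series solution of the form

$$y = x^{\lambda}(1 + a_1 x + a_2 x^2 + \cdots), \tag{8.73}$$

and find from the indicial equation that $\lambda = \pm\nu$. Setting $\lambda = \nu$ and inserting
the series into the equation, we find, with a conventional choice for
normalization, that

$$J_\nu(x) = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!\,(n + \nu)!}\left( \frac{x}{2} \right)^{2n+\nu}. \tag{8.74}$$

Here $(n + \nu)! \equiv \Gamma(n + \nu + 1)$. The functions $J_\nu(x)$ are called cylindrical Bessel
functions.

If $\nu$ is an integer we find that $J_{-n}(x) = (-1)^n J_n(x)$, so we have only found
one of the two independent solutions. It is therefore traditional to define the
Neumann function

$$N_\nu(x) = \frac{J_\nu(x)\cos\nu\pi - J_{-\nu}(x)}{\sin\nu\pi},$$

as this remains an independent second solution even as $\nu$ becomes integral.
At short distance, and for $\nu$ not an integer (taking $\nu > 0$ in the second expression),

$$J_\nu(x) = \left( \frac{x}{2} \right)^{\nu}\frac{1}{\Gamma(\nu + 1)} + \cdots, \qquad N_\nu(x) = -\frac{1}{\pi}\left( \frac{x}{2} \right)^{-\nu}\Gamma(\nu) + \cdots.$$

When $\nu$ tends to zero, we have

$$J_0(x) \sim 1 - \frac{1}{4}x^2 + \cdots, \qquad N_0(x) \sim \frac{2}{\pi}\left( \ln\frac{x}{2} + \gamma \right) + \cdots, \tag{8.77}$$

where $\gamma = 0.57721\ldots$ denotes the Euler-Mascheroni constant. For fixed $\nu$,
and $x \gg \nu$ we have the asymptotic expansions

$$J_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\cos\left( x - \frac{1}{2}\nu\pi - \frac{1}{4}\pi \right)\left( 1 + O\left(\frac{1}{x}\right) \right), \tag{8.78}$$

$$N_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\sin\left( x - \frac{1}{2}\nu\pi - \frac{1}{4}\pi \right)\left( 1 + O\left(\frac{1}{x}\right) \right). \tag{8.79}$$

It is therefore natural to define the Hankel functions

$$H_\nu^{(1)}(x) = J_\nu(x) + iN_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{i(x - \nu\pi/2 - \pi/4)}, \tag{8.80}$$

$$H_\nu^{(2)}(x) = J_\nu(x) - iN_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{-i(x - \nu\pi/2 - \pi/4)}. \tag{8.81}$$

We will derive these asymptotic forms in chapter ??.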

Generating Function 

The two-dimensional wave equation

$$\left( \nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \right)\Phi(r, \theta, t) = 0 \tag{8.82}$$

has solutions

$$\Phi = e^{i\omega t}e^{in\theta}J_n(kr), \tag{8.83}$$

where $k = |\omega|/c$. Equivalently, the two dimensional Helmholtz equation

$$(\nabla^2 + k^2)\Phi = 0, \tag{8.84}$$

has solutions $e^{in\theta}J_n(kr)$. It also has solutions with $J_n(kr)$ replaced by $N_n(kr)$,
but these are not finite at the origin. Since the $e^{in\theta}J_n(kr)$ are the only
solutions that are finite at the origin, any other finite solution should be
expandable in terms of them. In particular, we should be able to expand a
plane wave solution:

$$e^{iky} = e^{ikr\sin\theta} = \sum_n a_n e^{in\theta}J_n(kr). \tag{8.85}$$

As we will see in a moment, the $a_n$'s are all unity, so in fact

$$e^{ikr\sin\theta} = \sum_{n=-\infty}^{\infty}e^{in\theta}J_n(kr). \tag{8.86}$$



This generating function is the historical origin of the Bessel functions. They 
were introduced by the astronomer Wilhelm Bessel as a method of expressing 
the eccentric anomaly of a planetary position as a Fourier sine series in the 
mean anomaly — a modern version of Hipparchus' epicycles. 
From the generating function we see that

$$J_n(x) = \frac{1}{2\pi}\int_0^{2\pi}e^{-in\theta + ix\sin\theta}\,d\theta. \tag{8.87}$$



Whenever you come across a formula like this, involving the Fourier integral 
of the exponential of a trigonometric function, you are probably dealing with 
a Bessel function. 

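The generating function can also be written as

$$e^{\frac{x}{2}\left( t - \frac{1}{t} \right)} = \sum_{n=-\infty}^{\infty}t^n J_n(x). \tag{8.88}$$

Expanding the left-hand side and using the binomial theorem, we find

$$\mathrm{LHS} = \sum_{m=0}^{\infty}\left( \frac{x}{2} \right)^m\frac{1}{m!}\left[ \sum_{r+s=m}\frac{(r + s)!}{r!\,s!}(-1)^s t^r t^{-s} \right]$$

$$= \sum_{r=0}^{\infty}\sum_{s=0}^{\infty}(-1)^s\left( \frac{x}{2} \right)^{r+s}\frac{t^{r-s}}{r!\,s!}$$

$$= \sum_{n=-\infty}^{\infty}t^n\left\{ \sum_{s=0}^{\infty}\frac{(-1)^s}{s!\,(s + n)!}\left( \frac{x}{2} \right)^{2s+n} \right\}. \tag{8.89}$$

We recognize that the sum in the braces is the series expansion defining
$J_n(x)$. This therefore proves the generating function formula.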
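
As a numerical illustration of the generating function, one can sum the power series for $J_n(x)$ directly and compare $\sum_n e^{in\theta}J_n(x)$ with $e^{ix\sin\theta}$. This is only a sketch: the truncation parameters `terms` and `nmax` and the function names are our own assumptions, and the naive series is adequate only for moderate $|x|$.

```python
import cmath
import math

def bessel_j(n, x, terms=40):
    """J_n(x) for integer n from the power series; adequate for moderate |x|."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x, terms)   # J_{-n} = (-1)^n J_n
    return sum((-1) ** s * (x / 2) ** (2 * s + n)
               / (math.factorial(s) * math.factorial(s + n))
               for s in range(terms))

def generating_sum(x, theta, nmax=30):
    """Truncated sum over n of exp(i n theta) J_n(x), the right side of (8.86)."""
    return sum(cmath.exp(1j * n * theta) * bessel_j(n, x)
               for n in range(-nmax, nmax + 1))

x, theta = 2.5, 0.8
print(generating_sum(x, theta))              # truncated Bessel sum
print(cmath.exp(1j * x * math.sin(theta)))   # plane-wave factor it should reproduce
```

For large arguments the alternating series suffers from cancellation, so a production calculation would use a library routine or the recurrence relations instead.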



Bessel Identities 

There are many identities and integrals involving Bessel functions. The stan- 
dard reference is the monumental Treatise on the Theory of Bessel Functions 
by G. N. Watson. Here are just a few formulae for your delectation:

i) Starting from the generating function

$$\exp\left\{ \frac{1}{2}x\left( t - \frac{1}{t} \right) \right\} = \sum_{n=-\infty}^{\infty}J_n(x)t^n, \tag{8.90}$$

we can, with a few lines of work, establish the recurrence relations

$$2J_n'(x) = J_{n-1}(x) - J_{n+1}(x), \tag{8.91}$$

$$\frac{2n}{x}J_n(x) = J_{n-1}(x) + J_{n+1}(x), \tag{8.92}$$

together with

$$J_0'(x) = -J_1(x), \tag{8.93}$$

$$J_n(x + y) = \sum_{r=-\infty}^{\infty}J_r(x)J_{n-r}(y). \tag{8.94}$$

ii) From the series expansion for $J_n(x)$ we find

$$\frac{d}{dx}\{x^n J_n(x)\} = x^n J_{n-1}(x). \tag{8.95}$$

iii) By similar methods, we find

$$\left( \frac{1}{x}\frac{d}{dx} \right)^m\{x^{-n}J_n(x)\} = (-1)^m x^{-n-m}J_{n+m}(x). \tag{8.96}$$

iv) Again from the series expansion, we find

$$\int_0^{\infty}J_0(ax)e^{-px}\,dx = \frac{1}{\sqrt{a^2 + p^2}}. \tag{8.97}$$
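
Several of these identities lend themselves to a quick numerical sanity check. The sketch below, which is not part of the original notes, tests the recurrence relations (8.91) and (8.92) and the integral (8.97); the power-series evaluation of $J_n$, the Simpson-rule quadrature, and the truncation of the integral at $x = 15$ are all illustrative assumptions.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x) for integer n >= 0 from the power series; fine for |x| up to about 15."""
    return sum((-1) ** s * (x / 2) ** (2 * s + n)
               / (math.factorial(s) * math.factorial(s + n))
               for s in range(terms))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

x, n = 3.7, 2

# Recurrence (8.92): (2n/x) J_n = J_{n-1} + J_{n+1}
print(2 * n / x * bessel_j(n, x), bessel_j(n - 1, x) + bessel_j(n + 1, x))

# Recurrence (8.91), with the derivative estimated by a central difference
eps = 1e-5
derivative = (bessel_j(n, x + eps) - bessel_j(n, x - eps)) / (2 * eps)
print(2 * derivative, bessel_j(n - 1, x) - bessel_j(n + 1, x))

# Integral (8.97) with a = 1, p = 3; the neglected tail beyond x = 15 is below e^{-45}
a, p = 1.0, 3.0
integral = simpson(lambda t: bessel_j(0, a * t) * math.exp(-p * t), 0.0, 15.0)
print(integral, 1 / math.sqrt(a * a + p * p))
```

In each printed pair the two numbers should agree to within the accuracy of the finite difference or the quadrature.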

Semi-classical picture 

The Schrödinger equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi \tag{8.98}$$

can be separated in cylindrical polar co-ordinates, and has eigenfunctions

$$\psi_{k,l}(r, \theta) = J_l(kr)e^{il\theta}. \tag{8.99}$$

Figure 8.9: $J_{100}(x)$.

The eigenvalues are $E = \hbar^2 k^2/2m$. The quantity $L = \hbar l$ is the angular
momentum of the Schrödinger particle about the origin. If we impose rigid-wall
boundary conditions that $\psi_{k,l}(r, \theta)$ vanish on the circle $r = R$, then the
allowed $k$ form a discrete set $k_{l,n}$, where $J_l(k_{l,n}R) = 0$. To find the energy
eigenvalues we therefore need to know the location of the zeros of $J_l(x)$.
There is no closed form equation for these numbers, but they are tabulated.
The zeros for $kR \gg l$ are also approximated by the zeros of the asymptotic
expression

$$J_l(kR) \sim \sqrt{\frac{2}{\pi kR}}\cos\left( kR - \frac{1}{2}l\pi - \frac{1}{4}\pi \right), \tag{8.100}$$

which are located at

$$k_{l,n}R = \frac{1}{2}l\pi + \frac{1}{4}\pi + (2n + 1)\frac{\pi}{2}. \tag{8.101}$$
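
To see how good the estimate (8.101) is, one can refine each asymptotic guess by bisection on the series for $J_l$. The following is a minimal sketch with our own function names; it sets $R = 1$, evaluates $J_l$ from its power series, and assumes that a window of half-width 0.6 around each guess brackets exactly one zero, which holds for the first few zeros of $J_0$ but is not guaranteed in general.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x) for integer n >= 0 from the power series; fine for |x| up to about 15."""
    return sum((-1) ** s * (x / 2) ** (2 * s + n)
               / (math.factorial(s) * math.factorial(s + n))
               for s in range(terms))

def asymptotic_zero(l, n):
    """Estimate (8.101) for a zero of J_l, in units where R = 1."""
    return 0.5 * l * math.pi + 0.25 * math.pi + (2 * n + 1) * math.pi / 2

def refine_zero(l, guess, half_width=0.6, tol=1e-12):
    """Bisection for a sign change of J_l bracketed around `guess`."""
    lo, hi = guess - half_width, guess + half_width
    f_lo = bessel_j(l, lo)
    if f_lo * bessel_j(l, hi) > 0:
        raise ValueError("no sign change in the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(l, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

for n in range(3):
    guess = asymptotic_zero(0, n)
    print(n, guess, refine_zero(0, guess))   # asymptotic estimate, then refined zero
```

For $l = 0$ the asymptotic estimate is already within a few percent for the lowest zero, and the discrepancy shrinks as $n$ increases.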



If we let $R \to \infty$, then the spectrum becomes continuous and we are
describing unconfined scattering states. Since the particles are free, their
classical motion is in a straight line at constant velocity. A classical particle
making a closest approach at a distance $r_{\min}$ has angular momentum
$L = pr_{\min}$. Since $p = \hbar k$ is the particle's linear momentum, we have $l = kr_{\min}$.
Because the classical particle is never closer than $r_{\min}$, the quantum mechanical
wavefunction representing such a particle will become evanescent
(i.e. tend rapidly to zero) as soon as $r$ is smaller than $r_{\min}$. We therefore
expect that $J_l(kr) \approx 0$ if $kr < l$. This effect is dramatically illustrated by
the Mathematica™ plot in figure 8.9.

Figure 8.10: The geometric origin of $x(r)$ and $\theta(r)$ in 8.102.

Improved asymptotic expressions, which give a better estimate of the
$J_l(kr)$ zeros, are the approximations

$$J_l(kr) \approx \sqrt{\frac{2}{\pi kx}}\cos(kx - l\theta - \pi/4), \qquad N_l(kr) \approx \sqrt{\frac{2}{\pi kx}}\sin(kx - l\theta - \pi/4). \tag{8.102}$$

Here $\theta = \cos^{-1}(r_{\min}/r)$ and $x = r\sin\theta$ are functions of $r$. They have a
geometric origin in the right-angled triangle in figure 8.10. The parameter
$x$ has the physical interpretation of being the distance, measured from
the point of closest approach to the origin, along the straight-line classical
trajectory. The approximation is quite accurate once $r$ exceeds $r_{\min}$ by more
than a few percent.

The asymptotic $r^{-1/2}$ fall-off of the Bessel function is also understandable
in the semiclassical picture.

Figure 8.11: A collection of trajectories, each missing the origin by $r_{\min}$,
leaves a "hole".

Figure 8.12: The hole is visible in the real part of $\psi_{k,20}(r, \theta) = e^{i20\theta}J_{20}(kr)$.

By the uncertainty principle, a particle with definite angular momentum must
have completely uncertain angular position. The wavefunction $J_l(kr)e^{il\theta}$
therefore represents a coherent superposition of beams of particles approaching
from all directions, but all missing the origin by the same distance. The
density of classical particle trajectories is infinite at $r = r_{\min}$, forming a caustic.
By "conservation of lines", the particle density falls off as $1/r$ as we move
outwards. The particle density is proportional to $|\psi|^2$, so $\psi$ itself decreases
as $r^{-1/2}$. In contrast to the classical particle density, the quantum mechanical
wavefunction amplitude remains finite at the caustic, the "geometric
optics" infinity being tempered by diffraction effects.

Exercise 8.3: The WKB (Wentzel-Kramers-Brillouin) approximation to a solution
of the Schrödinger equation

$$-\frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x)$$

sets

$$\psi(x) \approx \frac{1}{\sqrt{k(x)}}\exp\left\{ \pm i\int_a^x k(\xi)\,d\xi \right\},$$

where $k(x) = \sqrt{E - V(x)}$, and $a$ is some conveniently chosen constant. This
form of the approximation is valid in classically allowed regions, where $k$ is
real, and away from "turning points" where $k$ goes to zero. In a classically
forbidden region, where $k$ is imaginary, the solutions should decay exponentially.
The connection rule that matches the standing wave in the classically
allowed region onto the decaying solution is

$$\frac{1}{2\sqrt{|k(x)|}}\exp\left\{ -\left| \int_a^x k(\xi)\,d\xi \right| \right\} \to \frac{1}{\sqrt{k(x)}}\cos\left\{ \left| \int_a^x k(\xi)\,d\xi \right| - \frac{\pi}{4} \right\},$$

where $a$ is the classical turning point. (The connection is safely made only
in the direction of the arrow. This is because a small error in the phase of the
cosine will introduce a small admixture of the growing solution, which will
eventually swamp the decaying solution.)

Show that setting $y(r) = r^{-1/2}\psi(r)$ in Bessel's equation

$$-\frac{d^2 y}{dr^2} - \frac{1}{r}\frac{dy}{dr} + \frac{l^2}{r^2}y = k^2 y$$

reduces it to Schrödinger form

$$-\frac{d^2\psi}{dr^2} + \frac{(l^2 - 1/4)}{r^2}\psi = k^2\psi.$$

From this show that a WKB approximation to $y(r)$ is

$$y(r) \approx \frac{1}{(r^2 - b^2)^{1/4}}\exp\left\{ \pm ik\int_b^r\frac{\sqrt{\rho^2 - b^2}}{\rho}\,d\rho \right\}, \quad r \gg b$$

$$= \frac{1}{\sqrt{x(r)}}\exp\{\pm i[kx(r) - l\theta(r)]\},$$

where $kb = \sqrt{l^2 - 1/4} \approx l$, and $x(r)$ and $\theta(r)$ were defined in connection with
(8.102). Deduce that the expressions (8.102) are WKB approximations and
are therefore accurate once we are away from the classical turning point at
$r = b \equiv r_{\min}$.



8.3.2 Orthogonality and Completeness 



We can write the equation obeyed by $J_m(kr)$ in Sturm-Liouville form. We
have

$$\frac{1}{r}\frac{d}{dr}\left( r\frac{dy}{dr} \right) + \left( k^2 - \frac{m^2}{r^2} \right)y = 0. \tag{8.103}$$

Comparison with the standard Sturm-Liouville equation shows that the weight
function, $w(r)$, is $r$, and the eigenvalues are $k^2$.

From Lagrange's identity we obtain

$$(k_1^2 - k_2^2)\int_0^R J_m(k_1 r)J_m(k_2 r)\,r\,dr = R\left[ k_2 J_m(k_1 R)J_m'(k_2 R) - k_1 J_m(k_2 R)J_m'(k_1 R) \right]. \tag{8.104}$$

We have no contribution from the origin on the right-hand side because all
$J_m$ Bessel functions except $J_0$ vanish there, whilst $J_0'(0) = 0$. For each $m$ we
get a set of orthogonal functions, $J_m(k_n r)$, provided the $k_n R$ are chosen
to be roots of $J_m(k_n R) = 0$ or $J_m'(k_n R) = 0$.

We can find the normalization constants by differentiating (8.104) with
respect to $k_1$ and then setting $k_1 = k_2$ in the result. We find

$$\int_0^R\left[ J_m(kr) \right]^2 r\,dr = \frac{1}{2}R^2\left[ \left[ J_m'(kR) \right]^2 + \left( 1 - \frac{m^2}{k^2 R^2} \right)\left[ J_m(kR) \right]^2 \right]$$

$$= \frac{1}{2}R^2\left[ \left[ J_m(kR) \right]^2 - J_{m-1}(kR)J_{m+1}(kR) \right]. \tag{8.105}$$

(The second equality follows on applying the recurrence relations for the
$J_m(kr)$, and provides an expression that is perhaps easier to remember.) For
Dirichlet boundary conditions we will require $k_n R$ to be a zero of $J_m$, and so
we have

$$\int_0^R\left[ J_m(kr) \right]^2 r\,dr = \frac{1}{2}R^2\left[ J_m'(kR) \right]^2. \tag{8.106}$$

For Neumann boundary conditions we require $k_n R$ to be a zero of $J_m'$. In
this case

$$\int_0^R\left[ J_m(kr) \right]^2 r\,dr = \frac{1}{2}R^2\left( 1 - \frac{m^2}{k^2 R^2} \right)\left[ J_m(kR) \right]^2. \tag{8.107}$$
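
The orthogonality implied by (8.104) and the Dirichlet normalization (8.106) can be checked by direct quadrature for $m = 0$. This sketch is an illustration under stated assumptions: the zeros of $J_0$ are entered as standard tabulated values, $J_0$ and $J_1$ are summed from their power series, the derivative is obtained from $J_0' = -J_1$ (8.93), and the helper names are our own.

```python
import math

def bessel_j(n, x, terms=60):
    """J_n(x) for integer n >= 0 from the power series; fine for |x| up to about 15."""
    return sum((-1) ** s * (x / 2) ** (2 * s + n)
               / (math.factorial(s) * math.factorial(s + n))
               for s in range(terms))

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# First two positive zeros of J_0 (standard tabulated values), with R = 1.
k1, k2 = 2.404825557695773, 5.520078110286311
R = 1.0

overlap = simpson(lambda r: bessel_j(0, k1 * r) * bessel_j(0, k2 * r) * r, 0.0, R)
norm = simpson(lambda r: bessel_j(0, k1 * r) ** 2 * r, 0.0, R)
# Dirichlet case (8.106), using J_0' = -J_1 from (8.93)
predicted = 0.5 * R ** 2 * bessel_j(1, k1 * R) ** 2

print(overlap)           # should be close to zero
print(norm, predicted)   # the two numbers should agree
```

Note the weight factor $r$ in both integrands; without it the functions $J_0(k_n r)$ are not orthogonal.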



Figure 8.13: Cylinder geometry.

Example: Harmonic function in cylinder. We wish to solve $\nabla^2 V = 0$ within
a cylinder of height $L$ and radius $a$. The voltage is prescribed on the upper
surface of the cylinder: $V(r, \theta, L) = U(r, \theta)$. We are told that $V = 0$ on all
other parts of the boundary.

The general solution of Laplace's equation will be a sum of terms such
as

$$\left\{ \begin{matrix} \sinh(kz) \\ \cosh(kz) \end{matrix} \right\} \times \left\{ \begin{matrix} J_m(kr) \\ N_m(kr) \end{matrix} \right\} \times \left\{ \begin{matrix} \sin(m\theta) \\ \cos(m\theta) \end{matrix} \right\}, \tag{8.108}$$

where the braces indicate a choice of upper or lower functions. We must take
only the $\sinh(kz)$ terms because we know that $V = 0$ at $z = 0$, and only the
$J_m(kr)$ terms because $V$ is finite at $r = 0$. The $k$'s are also restricted by the
boundary condition on the sides of the cylinder to be such that $J_m(ka) = 0$.
We therefore expand the prescribed voltage as

$$U(r, \theta) = \sum_{m,n}\sinh(k_{nm}L)J_m(k_{nm}r)\left[ A_{nm}\sin(m\theta) + B_{nm}\cos(m\theta) \right], \tag{8.109}$$

and use the orthogonality of the trigonometric and Bessel functions to find
the coefficients to be

$$A_{nm} = \frac{2\,\mathrm{cosech}(k_{nm}L)}{\pi a^2\left[ J_m'(k_{nm}a) \right]^2}\int_0^{2\pi}d\theta\int_0^a U(r, \theta)J_m(k_{nm}r)\sin(m\theta)\,r\,dr, \tag{8.110}$$

$$B_{nm} = \frac{2\,\mathrm{cosech}(k_{nm}L)}{\pi a^2\left[ J_m'(k_{nm}a) \right]^2}\int_0^{2\pi}d\theta\int_0^a U(r, \theta)J_m(k_{nm}r)\cos(m\theta)\,r\,dr, \quad m \ne 0, \tag{8.111}$$

and

$$B_{n0} = \frac{1}{2}\,\frac{2\,\mathrm{cosech}(k_{n0}L)}{\pi a^2\left[ J_0'(k_{n0}a) \right]^2}\int_0^{2\pi}d\theta\int_0^a U(r, \theta)J_0(k_{n0}r)\,r\,dr. \tag{8.112}$$

Then we fit the boundary data expansion to the general solution, and so find

$$V(r, \theta, z) = \sum_{m,n}\sinh(k_{nm}z)J_m(k_{nm}r)\left[ A_{nm}\sin(m\theta) + B_{nm}\cos(m\theta) \right]. \tag{8.113}$$


Hankel Transforms 

When the radius, $R$, of the region in which we are performing our eigenfunction expansion becomes infinite, the eigenvalue spectrum will become continuous, and the sum over the discrete $k_n$ Bessel-function zeros must be replaced by an integral over $k$. By using the asymptotic approximation

$$J_n(kR) \sim \sqrt{\frac{2}{\pi kR}}\,\cos\left(kR - \frac{1}{2}n\pi - \frac{1}{4}\pi\right), \tag{8.114}$$

we may estimate the normalization integral as

$$\int_0^R \left[J_n(kr)\right]^2 r\,dr \sim \frac{R}{\pi k} + O(1). \tag{8.115}$$

We also find that the asymptotic density of Bessel zeros is

$$\frac{dn}{dk} = \frac{R}{\pi}. \tag{8.116}$$

Putting these two results together shows that the continuous-spectrum orthogonality and completeness relations are

$$\int_0^\infty J_n(kr)\,J_n(k'r)\,r\,dr = \frac{1}{k}\,\delta(k - k'), \tag{8.117}$$

$$\int_0^\infty J_n(kr)\,J_n(kr')\,k\,dk = \frac{1}{r}\,\delta(r - r'), \tag{8.118}$$

respectively. These two equations establish that the Hankel transform (also called the Fourier-Bessel transform) of a function $f(r)$, which is defined by

$$F(k) = \int_0^\infty J_n(kr)\,f(r)\,r\,dr, \tag{8.119}$$

has as its inverse

$$f(r) = \int_0^\infty J_n(kr)\,F(k)\,k\,dk. \tag{8.120}$$



(See exercise 8.14 for an alternative derivation of the Hankel-transform pair.) 
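Some Hankel transform pairs:

$$\int_0^\infty e^{-ar}J_0(kr)\,dr = \frac{1}{\sqrt{k^2 + a^2}}, \qquad \int_0^\infty \frac{J_0(kr)}{\sqrt{k^2 + a^2}}\,k\,dk = \frac{e^{-ar}}{r}. \tag{8.121}$$

$$\int_0^\infty \cos(ar)\,J_0(kr)\,dr = \begin{cases} 0, & k < a,\\ 1/\sqrt{k^2 - a^2}, & k > a,\end{cases} \qquad \int_a^\infty \frac{J_0(kr)}{\sqrt{k^2 - a^2}}\,k\,dk = \frac{1}{r}\cos(ar). \tag{8.122}$$

$$\int_0^\infty \sin(ar)\,J_0(kr)\,dr = \begin{cases} 1/\sqrt{a^2 - k^2}, & k < a,\\ 0, & k > a,\end{cases} \qquad \int_0^a \frac{J_0(kr)}{\sqrt{a^2 - k^2}}\,k\,dk = \frac{1}{r}\sin(ar). \tag{8.123}$$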
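The exponential pair is the easiest of these to confirm by direct quadrature, because its integrand decays rapidly. The sketch below is illustrative only: the function names and the values $a = 1$, $k = 2$ are assumptions made here. It compares a truncated Simpson estimate of $\int_0^\infty e^{-ar}J_0(kr)\,dr$ with $1/\sqrt{k^2 + a^2}$.

```python
import math

def bessel_j0(x, steps=200):
    """J_0(x) from Bessel's integral (1/pi) int_0^pi cos(x sin t) dt, midpoint rule."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps)) * h / math.pi

def exponential_hankel_integral(a, k, r_max=30.0, steps=3000):
    """Simpson estimate of int_0^r_max exp(-a r) J_0(k r) dr (steps must be even).

    The tail beyond r_max is bounded by exp(-a * r_max) / a and is neglected.
    """
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        r = i * h
        total += weight * math.exp(-a * r) * bessel_j0(k * r)
    return total * h / 3.0

a, k = 1.0, 2.0                      # illustrative values
numeric = exponential_hankel_integral(a, k)
exact = 1.0 / math.sqrt(k * k + a * a)
print(numeric, exact)
```

The cosine and sine pairs are only conditionally convergent, so a naive truncation like this one would not be a reliable test for them.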



Example: Weber's disc problem. Consider a thin isolated conducting disc of radius $a$ lying on the $x$-$y$ plane in $\mathbb{R}^3$. The disc is held at potential $V_0$. We seek the potential $V$ in the entirety of $\mathbb{R}^3$, such that $V \to 0$ at infinity.

It is easiest to first find $V$ in the half-space $z \ge 0$, and then extend the solution to $z < 0$ by symmetry. Because the problem is axisymmetric, we will make use of cylindrical polar co-ordinates with their origin at the centre of the disc. In the region $z \ge 0$ the potential $V(r,z)$ obeys

$$\begin{aligned}
\nabla^2 V(r,z) &= 0, & z &> 0,\\
V(r,z) &\to 0, & |z| &\to \infty,\\
V(r,0) &= V_0, & r &< a,\\
\left.\frac{\partial V}{\partial z}\right|_{z=0} &= 0, & r &> a.
\end{aligned} \tag{8.124}$$

This is a mixed boundary value problem. We have imposed Dirichlet boundary conditions on $r < a$ and Neumann boundary conditions for $r > a$.

We expand the axisymmetric solution of Laplace's equation in terms of Bessel functions as

$$V(r,z) = \int_0^\infty A(k)\,e^{-k|z|}J_0(kr)\,dk, \tag{8.125}$$

and so require the unknown coefficient function $A(k)$ to obey

$$\begin{aligned}
\int_0^\infty A(k)\,J_0(kr)\,dk &= V_0, & r &< a,\\
\int_0^\infty k\,A(k)\,J_0(kr)\,dk &= 0, & r &> a.
\end{aligned} \tag{8.126}$$

No elementary algorithm for solving such a pair of dual integral equations exists. In this case, however, some inspired guesswork helps. By integrating the first equation of the transform pair (8.122) with respect to $a$, we discover that

$$\int_0^\infty \frac{\sin(ar)}{r}\,J_0(kr)\,dr = \begin{cases} \pi/2, & k < a,\\ \sin^{-1}(a/k), & k > a.\end{cases} \tag{8.127}$$

With this result in hand, we then observe that (8.123) tells us that the function

$$A(k) = \frac{2V_0\sin(ka)}{\pi k} \tag{8.128}$$

satisfies both equations. Thus

$$V(r,z) = \frac{2V_0}{\pi}\int_0^\infty e^{-k|z|}\sin(ka)\,J_0(kr)\,\frac{dk}{k}. \tag{8.129}$$
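As a quick consistency check (a worked special case added here, not taken from the original text), set $r = 0$ in (8.129). Since $J_0(0) = 1$, what remains is the standard integral $\int_0^\infty e^{-pk}\sin(ak)\,dk/k = \tan^{-1}(a/p)$ for $p > 0$, so the potential on the symmetry axis is elementary:

$$V(0,z) = \frac{2V_0}{\pi}\int_0^\infty e^{-k|z|}\sin(ka)\,\frac{dk}{k} = \frac{2V_0}{\pi}\tan^{-1}\left(\frac{a}{|z|}\right).$$

This tends to $V_0$ as $z \to 0$, as it must on the disc, and falls off like $2V_0a/(\pi|z|)$ when $|z| \gg a$, which is the Coulomb-like decay expected far from an isolated charged conductor.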

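The potential on the plane $z = 0$ can be evaluated explicitly to be

$$V(r,0) = \begin{cases} V_0, & r < a,\\ (2V_0/\pi)\sin^{-1}(a/r), & r > a.\end{cases} \tag{8.130}$$

The charge distribution on the disc can also be found as

$$\sigma(r) = \left.\frac{\partial V}{\partial z}\right|_{z=0-} - \left.\frac{\partial V}{\partial z}\right|_{z=0+} = \frac{4V_0}{\pi}\int_0^\infty \sin(ak)\,J_0(kr)\,dk = \frac{4V_0}{\pi\sqrt{a^2 - r^2}}, \quad r < a. \tag{8.131}$$

8.3.3 Modified Bessel Functions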

When $k$ is real the Bessel function $J_n(kr)$ and the Neumann function $N_n(kr)$ oscillate at large distance. When $k$ is purely imaginary, it is convenient to combine them so as to have functions that grow or decay exponentially. These combinations are the modified Bessel functions $I_n(kr)$ and $K_n(kr)$. These functions are initially defined for non-integer $\nu$ by

$$I_\nu(x) = i^{-\nu}J_\nu(ix), \tag{8.132}$$

$$K_\nu(x) = \frac{\pi}{2\sin\nu\pi}\left[I_{-\nu}(x) - I_\nu(x)\right]. \tag{8.133}$$

The factor of $i^{-\nu}$ in the definition of $I_\nu(x)$ is inserted to make $I_\nu$ real. Our definition of $K_\nu(x)$ is that in Abramowitz and Stegun's Handbook of Mathematical Functions. It differs from that of Whittaker and Watson, who divide by $\tan\nu\pi$ instead of $\sin\nu\pi$.

At short distance, and for $\nu > 0$,

$$I_\nu(x) = \left(\frac{x}{2}\right)^{\nu}\frac{1}{\Gamma(\nu+1)} + \cdots, \tag{8.134}$$

$$K_\nu(x) = \frac{1}{2}\,\Gamma(\nu)\left(\frac{x}{2}\right)^{-\nu} + \cdots. \tag{8.135}$$

When $\nu$ becomes an integer we must take limits, and in particular

$$I_0(x) = 1 + \frac{1}{4}x^2 + \cdots, \tag{8.136}$$

$$K_0(x) = -\left(\ln(x/2) + \gamma\right) + \cdots. \tag{8.137}$$

The large $x$ asymptotic behaviour is

$$I_\nu(x) \sim \frac{1}{\sqrt{2\pi x}}\,e^{x}, \quad x \to \infty, \tag{8.138}$$

$$K_\nu(x) \sim \sqrt{\frac{\pi}{2x}}\,e^{-x}, \quad x \to \infty. \tag{8.139}$$

From the expression for $J_n(x)$ as an integral, we have

$$I_n(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{in\theta}e^{x\cos\theta}\,d\theta = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta)\,e^{x\cos\theta}\,d\theta \tag{8.140}$$

for integer $n$. When $n$ is not an integer we still have an expression for $I_\nu(x)$ as an integral, but now it is

$$I_\nu(x) = \frac{1}{\pi}\int_0^{\pi}\cos(\nu\theta)\,e^{x\cos\theta}\,d\theta - \frac{\sin\nu\pi}{\pi}\int_0^\infty e^{-x\cosh t - \nu t}\,dt. \tag{8.141}$$

Here we need $|\arg x| < \pi/2$ for the second integral to converge. The origin of the "extra" infinite integral must remain a mystery until we learn how to use complex integral methods for solving differential equations. From the definition of $K_\nu(x)$ in terms of $I_\nu$ we find

$$K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt, \quad |\arg x| < \pi/2. \tag{8.142}$$
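The integral representations (8.140) and (8.142) double as a simple way to compute $I_n$ and $K_\nu$ for real positive $x$. The sketch below is an illustration under stated assumptions, not the text's own method: the names `bessel_i` and `bessel_k`, the truncation point `t_max`, and the step counts are choices made here. It checks the results against the exact identity $K_{1/2}(x) = \sqrt{\pi/2x}\,e^{-x}$ and against the large-$x$ forms (8.138) and (8.139).

```python
import math

def bessel_i(n, x, steps=200):
    """I_n(x), integer n, from (8.140): (1/pi) int_0^pi cos(n t) exp(x cos t) dt."""
    h = math.pi / steps
    return sum(math.cos(n * (i + 0.5) * h) * math.exp(x * math.cos((i + 0.5) * h))
               for i in range(steps)) * h / math.pi

def bessel_k(nu, x, t_max=12.0, steps=6000):
    """K_nu(x), x > 0, from (8.142), truncated at t_max (trapezoidal rule)."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        # The integrand decays like exp(-x e^t / 2), so the truncation is harmless
        # for the moderate values of x used below.
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

i0_at_1 = bessel_i(0, 1.0)
k0_at_1 = bessel_k(0.0, 1.0)

# Half-integer order is elementary: K_{1/2}(x) = sqrt(pi / (2 x)) exp(-x).
k_half = bessel_k(0.5, 2.0)
k_half_exact = math.sqrt(math.pi / (2 * 2.0)) * math.exp(-2.0)

# Ratios to the leading asymptotic forms (8.138) and (8.139) at x = 20.
ratio_i = bessel_i(0, 20.0) / (math.exp(20.0) / math.sqrt(2 * math.pi * 20.0))
ratio_k = bessel_k(0.0, 20.0) / (math.sqrt(math.pi / (2 * 20.0)) * math.exp(-20.0))

print(i0_at_1, k0_at_1)
print(k_half, k_half_exact)
print(ratio_i, ratio_k)
```

Both ratios should be close to, but not exactly, one: the leading asymptotic forms have relative corrections of order $1/x$.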

Physics Illustration: Light propagation in optical fibres. Consider the propagation of light of frequency $\omega$ down a straight section of optical fibre. Typical fibres are made of two materials: an outer layer, or cladding, with refractive index $n_2$, and an inner core with refractive index $n_1 > n_2$. The core of a fibre used for communication is usually less than 10 μm in diameter.

We will treat the light field $E$ as a scalar. (This is not a particularly good approximation for real fibres, but the complications due to the vector character of the electromagnetic field are considerable.) We suppose that $E$ obeys

$$\frac{\partial^2 E}{\partial x^2} + \frac{\partial^2 E}{\partial y^2} + \frac{\partial^2 E}{\partial z^2} - \frac{n^2(x,y)}{c^2}\frac{\partial^2 E}{\partial t^2} = 0. \tag{8.143}$$

Here $n(x,y)$ is the refractive index of the fibre, which is assumed to lie along the $z$ axis. We set

$$E(x,y,z,t) = \psi(x,y,z)\,e^{ik_0z - i\omega t}, \tag{8.144}$$

where $k_0 = \omega/c$. The amplitude $\psi$ is a (relatively) slowly varying envelope function. Plugging into the wave equation we find that

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + 2ik_0\frac{\partial\psi}{\partial z} + \left(\frac{n^2(x,y)}{c^2}\,\omega^2 - k_0^2\right)\psi = 0. \tag{8.145}$$

Because $\psi$ is slowly varying, we neglect the second derivative of $\psi$ with respect to $z$, and this becomes

$$2ik_0\frac{\partial\psi}{\partial z} = -\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\psi + k_0^2\left(1 - n^2(x,y)\right)\psi, \tag{8.146}$$






which is the two-dimensional time-dependent Schrödinger equation, but with $t$ replaced by $z/2k_0$, where $z$ is the distance down the fibre. The wave-modes that will be trapped and guided by the fibre will be those corresponding to bound states of the axisymmetric potential

$$V(x,y) = k_0^2\left(1 - n^2(r)\right). \tag{8.147}$$

If these bound states have (negative) "energy" $E_n$, then $\psi \propto e^{-iE_nz/2k_0}$, and so the actual wavenumber for frequency $\omega$ is

$$k = k_0 - E_n/2k_0. \tag{8.148}$$

In order to have a unique propagation velocity for signals on the fibre, it is therefore necessary that the potential support one, and only one, bound state.

If

$$n(r) = \begin{cases} n_1, & r < a,\\ n_2, & r > a,\end{cases} \tag{8.149}$$

then the bound state solutions will be of the form

$$\psi(r,\theta) = \begin{cases} e^{in\theta}e^{i\beta z}J_n(\kappa r), & r < a,\\ A\,e^{in\theta}e^{i\beta z}K_n(\gamma r), & r > a,\end{cases} \tag{8.150}$$

where

$$\kappa^2 = \left(n_1^2k_0^2 - \beta^2\right), \tag{8.151}$$

$$\gamma^2 = \left(\beta^2 - n_2^2k_0^2\right). \tag{8.152}$$

To ensure that we have a solution decaying away from the core, we need $\beta$ to be such that both $\kappa$ and $\gamma$ are real. We therefore require

$$n_1^2 > \frac{\beta^2}{k_0^2} > n_2^2. \tag{8.153}$$

At the interface both $\psi$ and its radial derivative must be continuous, and so we will have a solution only if $\beta$ is such that

$$\kappa\,\frac{J_n'(\kappa a)}{J_n(\kappa a)} = \gamma\,\frac{K_n'(\gamma a)}{K_n(\gamma a)}.$$

This Schrödinger approximation to the wave equation has other applications. It is called the paraxial approximation.
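To see the matching condition in action, here is a rough numerical sketch for the lowest ($n = 0$) mode. It assumes the continuity condition takes the form $\kappa J_0'(\kappa a)/J_0(\kappa a) = \gamma K_0'(\gamma a)/K_0(\gamma a)$ and uses $J_0' = -J_1$, $K_0' = -K_1$. The fibre parameters (indices 1.450 and 1.445, core radius 4 μm, wavelength 1.55 μm) are hypothetical values chosen for illustration, and the helper names are not from the text. The Bessel functions are computed from their integral representations.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x), integer n, from Bessel's integral (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(n * (i + 0.5) * h - x * math.sin((i + 0.5) * h))
               for i in range(steps)) * h / math.pi

def bessel_k(nu, x, t_max=20.0, steps=4000):
    """K_nu(x), x > 0, from int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoidal rule)."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def mode_mismatch(u, v):
    """u J_1(u)/J_0(u) - w K_1(w)/K_0(w), with u = kappa*a, w = gamma*a, u^2 + w^2 = v^2."""
    w = math.sqrt(v * v - u * u)
    return u * bessel_j(1, u) / bessel_j(0, u) - w * bessel_k(1, w) / bessel_k(0, w)

def lowest_mode(n1, n2, a, wavelength):
    """Return (k0, kappa, beta) for the n = 0 mode; a and wavelength in the same units."""
    k0 = 2.0 * math.pi / wavelength
    v = k0 * a * math.sqrt(n1 * n1 - n2 * n2)   # from adding (8.151) and (8.152)
    lo = 1e-3 * v
    hi = min(v, 2.4048) * (1.0 - 1e-3)          # stay below the first zero of J_0
    f_lo, f_hi = mode_mismatch(lo, v), mode_mismatch(hi, v)
    if f_lo * f_hi > 0.0:
        # Happens for very weak guiding, where the root sits extremely close to u = v.
        raise ValueError("no sign change in the search bracket")
    for _ in range(50):                          # plain bisection
        mid = 0.5 * (lo + hi)
        f_mid = mode_mismatch(mid, v)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    kappa = 0.5 * (lo + hi) / a
    beta = math.sqrt((n1 * k0) ** 2 - kappa ** 2)
    return k0, kappa, beta

n1, n2, a, wavelength = 1.450, 1.445, 4.0, 1.55   # hypothetical fibre; lengths in micrometres
k0, kappa, beta = lowest_mode(n1, n2, a, wavelength)
gamma = math.sqrt(beta ** 2 - (n2 * k0) ** 2)
print(beta / k0, kappa, gamma)
```

The printed effective index $\beta/k_0$ should fall between $n_2$ and $n_1$, in line with (8.153).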
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8.3.4 Spherical Bessel Functions 

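Consider the wave equation

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\varphi(r,\theta,\phi,t) = 0 \tag{8.154}$$

in spherical polar coordinates. To apply separation of variables, we set

$$\varphi = e^{i\omega t}\,Y_l^m(\theta,\phi)\,\chi(r), \tag{8.155}$$

and find that

$$\frac{d^2\chi}{dr^2} + \frac{2}{r}\frac{d\chi}{dr} - \frac{l(l+1)}{r^2}\chi + \frac{\omega^2}{c^2}\chi = 0. \tag{8.156}$$

Substitute $\chi = r^{-1/2}R(r)$ and we have

$$\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left(\frac{\omega^2}{c^2} - \frac{(l+\frac{1}{2})^2}{r^2}\right)R = 0. \tag{8.157}$$

This is Bessel's equation with $\nu^2 \to (l+\frac{1}{2})^2$. Therefore the general solution is

$$R = A\,J_{l+\frac{1}{2}}(kr) + B\,J_{-l-\frac{1}{2}}(kr), \tag{8.158}$$

where $k = |\omega|/c$. Now inspection of the series definition of the $J_\nu$ reveals that

$$J_{\frac{1}{2}}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x, \tag{8.159}$$

$$J_{-\frac{1}{2}}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x, \tag{8.160}$$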

so these Bessel functions are actually elementary functions. This is true of all Bessel functions of half-integer order, $\nu = \pm 1/2, \pm 3/2, \ldots$. We define the spherical Bessel functions by²

$$j_l(x) = \sqrt{\frac{\pi}{2x}}\,J_{l+\frac{1}{2}}(x), \tag{8.161}$$

$$n_l(x) = (-1)^{l+1}\sqrt{\frac{\pi}{2x}}\,J_{-(l+\frac{1}{2})}(x). \tag{8.162}$$

² We are using the definitions from Schiff's Quantum Mechanics.

The first few are

$$\begin{aligned}
j_0(x) &= \frac{1}{x}\sin x,\\
j_1(x) &= \frac{1}{x^2}\sin x - \frac{1}{x}\cos x,\\
j_2(x) &= \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3}{x^2}\cos x,\\
n_0(x) &= -\frac{1}{x}\cos x,\\
n_1(x) &= -\frac{1}{x^2}\cos x - \frac{1}{x}\sin x,\\
n_2(x) &= -\left(\frac{3}{x^3} - \frac{1}{x}\right)\cos x - \frac{3}{x^2}\sin x.
\end{aligned}$$

Despite the appearance of negative powers of $x$, the $j_l(x)$ are all finite at $x = 0$. The $n_l(x)$ all diverge to $-\infty$ as $x \to 0$. In general

$$j_l(x) = f_l(x)\sin x + g_l(x)\cos x, \tag{8.163}$$

$$n_l(x) = -f_l(x)\cos x + g_l(x)\sin x, \tag{8.164}$$

where $f_l(x)$ and $g_l(x)$ are polynomials in $1/x$.

We also define the spherical Hankel functions by

$$h_l^{(1)}(x) = j_l(x) + i\,n_l(x), \tag{8.165}$$

$$h_l^{(2)}(x) = j_l(x) - i\,n_l(x). \tag{8.166}$$

These behave like

$$h_l^{(1)}(x) \sim \frac{1}{x}\,e^{i(x - [l+1]\pi/2)}, \tag{8.167}$$

$$h_l^{(2)}(x) \sim \frac{1}{x}\,e^{-i(x - [l+1]\pi/2)}, \tag{8.168}$$

at large $x$.

The solution to the wave equation regular at the origin is therefore a sum of terms such as

$$\varphi_{k,l,m}(r,\theta,\phi,t) = j_l(kr)\,Y_l^m(\theta,\phi)\,e^{-i\omega t}, \tag{8.169}$$

where $\omega = \pm ck$, with $k > 0$. For example, the plane wave $e^{ikz}$ has expansion

$$e^{ikz} = e^{ikr\cos\theta} = \sum_{l=0}^{\infty}(2l+1)\,i^l\,j_l(kr)\,P_l(\cos\theta), \tag{8.170}$$

or equivalently, using (8.66),

$$e^{i\mathbf{k}\cdot\mathbf{r}} = 4\pi\sum_{l=0}^{\infty}\sum_{m=-l}^{l} i^l\,j_l(kr)\left[Y_l^m(\hat{\mathbf{k}})\right]^{*}Y_l^m(\hat{\mathbf{r}}), \tag{8.171}$$

where $\hat{\mathbf{k}}$, $\hat{\mathbf{r}}$, unit vectors in the direction of $\mathbf{k}$ and $\mathbf{r}$ respectively, are used as a shorthand notation to indicate the angles that should be inserted into the spherical harmonics. This angular-momentum-adapted expansion of a plane wave provides a useful tool in scattering theory.
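The partial-wave expansion of $e^{ikr\cos\theta}$ as $\sum_l (2l+1)\,i^l j_l(kr)P_l(\cos\theta)$ converges quickly once $l$ exceeds $kr$, and this is easy to confirm numerically. The sketch below is illustrative: the function names, the use of the power series for $j_l$ (chosen because upward recurrence is unstable for $l > x$), and the sample point $kr = 5$, $\cos\theta = 0.3$ are all assumptions made here.

```python
import cmath

def spherical_jn(l, x, terms=60):
    """j_l(x) from its power series x^l * sum_k (-x^2/2)^k / (k! (2l+2k+1)!!).

    Adequate for moderate x (say x < 10); cancellation grows for larger arguments.
    """
    term = 1.0
    for j in range(1, l + 1):
        term *= x / (2 * j + 1)          # builds x^l / (2l+1)!!
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * l + 2 * k + 1))
        total += term
    return total

def legendre(l, c):
    """P_l(c) by the recurrence (n+1) P_{n+1} = (2n+1) c P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, c
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * c * p - n * p_prev) / (n + 1)
    return p

def plane_wave_partial_sum(kr, cos_theta, l_max):
    """Sum of (2l+1) i^l j_l(kr) P_l(cos_theta) for l = 0..l_max."""
    return sum((2 * l + 1) * (1j ** l) * spherical_jn(l, kr) * legendre(l, cos_theta)
               for l in range(l_max + 1))

kr, cos_theta = 5.0, 0.3                 # illustrative sample point
exact = cmath.exp(1j * kr * cos_theta)
for l_max in (2, 5, 10, 30):
    print(l_max, abs(plane_wave_partial_sum(kr, cos_theta, l_max) - exact))
```

The printed error should stay of order one while `l_max` is well below `kr` and then collapse rapidly, reflecting the fact that partial waves with $l \gg kr$ barely reach the origin.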

Exercise 8.4: Peierls' Problem. Critical Mass. The core of a nuclear device consists of a sphere of fissile $^{235}$U of radius $R$. It is surrounded by a thick shell of non-fissile material which acts as a neutron reflector, or tamper.




Figure 8.14: Fission core. 

In the core, the fast neutron density $n(\mathbf{r},t)$ obeys

$$\frac{\partial n}{\partial t} = \nu\,n + D_F\nabla^2 n. \tag{8.172}$$

Here the term with $\nu$ accounts for the production of additional neutrons due to induced fission. The term with $D_F$ describes the diffusion of the fast neutrons. In the tamper the neutron density obeys

$$\frac{\partial n}{\partial t} = D_T\nabla^2 n. \tag{8.173}$$

Both the neutron density $n$ and flux $\mathbf{j} \equiv D_{F,T}\nabla n$ are continuous across the interface between the two materials. Find an equation determining the critical radius $R_c$ above which the neutron density grows exponentially. Show that the critical radius for an assembly with a tamper consisting of $^{238}$U ($D_T = D_F$) is one-half of that for a core surrounded only by air ($D_T = \infty$), and so the use of a thick $^{238}$U tamper reduces the critical mass by a factor of eight.



Factorization and Recurrence 

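The equation obeyed by the spherical Bessel function is

$$-\frac{d^2\chi_l}{dx^2} - \frac{2}{x}\frac{d\chi_l}{dx} + \frac{l(l+1)}{x^2}\chi_l = k^2\chi_l, \tag{8.174}$$

or, in Sturm-Liouville form,

$$-\frac{1}{x^2}\frac{d}{dx}\left(x^2\frac{d\chi_l}{dx}\right) + \frac{l(l+1)}{x^2}\chi_l = k^2\chi_l. \tag{8.175}$$

The corresponding differential operator is formally self-adjoint with respect to the inner product

$$\langle f, g\rangle = \int_0^\infty (f^*g)\,x^2\,dx. \tag{8.176}$$

Now, the operator

$$D_l = -\frac{d^2}{dx^2} - \frac{2}{x}\frac{d}{dx} + \frac{l(l+1)}{x^2} \tag{8.177}$$

factorizes as

$$D_l = \left(-\frac{d}{dx} + \frac{l-1}{x}\right)\left(\frac{d}{dx} + \frac{l+1}{x}\right), \tag{8.178}$$

or as

$$D_l = \left(\frac{d}{dx} + \frac{l+2}{x}\right)\left(-\frac{d}{dx} + \frac{l}{x}\right). \tag{8.179}$$

Since, with respect to the $w = x^2$ inner product, we have

$$\left(\frac{d}{dx}\right)^{\dagger} = -\frac{1}{x^2}\frac{d}{dx}x^2 = -\frac{d}{dx} - \frac{2}{x}, \tag{8.180}$$

we can write

$$D_l = A_l^{\dagger}A_l = A_{l+1}A_{l+1}^{\dagger}, \tag{8.181}$$

where

$$A_l = \left(\frac{d}{dx} + \frac{l+1}{x}\right). \tag{8.182}$$

From this we can deduce

$$A_l\,j_l \propto j_{l-1}, \tag{8.183}$$

$$A_{l+1}^{\dagger}\,j_l \propto j_{l+1}. \tag{8.184}$$

The constants of proportionality are in each case unity. The same recurrence formulae hold for the spherical Neumann functions $n_l$.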
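The lowering and raising relations are easy to spot-check against the closed forms of the first few spherical Bessel functions. In the sketch below the operators are taken to be $A_l = d/dx + (l+1)/x$ and $A_{l+1}^\dagger = -d/dx + l/x$, the derivative is approximated by a central difference, and the names `lower`, `raise_` and the sample point $x = 2.7$ are illustrative choices.

```python
import math

def j0(x):
    return math.sin(x) / x

def j1(x):
    return math.sin(x) / x**2 - math.cos(x) / x

def j2(x):
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def lower(f, l, x):
    """(A_l f)(x) = f'(x) + (l + 1) f(x) / x; should map j_l to j_{l-1}."""
    return derivative(f, x) + (l + 1) * f(x) / x

def raise_(f, l, x):
    """(A_{l+1}^dagger f)(x) = -f'(x) + l f(x) / x; should map j_l to j_{l+1}."""
    return -derivative(f, x) + l * f(x) / x

x = 2.7  # arbitrary sample point away from the origin
print(lower(j1, 1, x), j0(x))     # A_1 j_1 = j_0
print(lower(j2, 2, x), j1(x))     # A_2 j_2 = j_1
print(raise_(j0, 0, x), j1(x))    # A_1^dagger j_0 = j_1
print(raise_(j1, 1, x), j2(x))    # A_2^dagger j_1 = j_2
```

Adding the two relations for the same $l$ eliminates the derivative and gives the three-term recurrence $j_{l-1}(x) + j_{l+1}(x) = (2l+1)\,j_l(x)/x$.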

8.4 Singular Endpoints 

In this section we will exploit our knowledge of the Laplace eigenfunctions 
in spherical and plane polar coordinates to illustrate Weyl's theory of self- 
adjoint boundary conditions at singular endpoints. We also connect Weyl's 
theory with concepts from scattering theory. 



8.4.1 Weyl's Theorem 

Consider the Sturm-Liouville eigenvalue problem

$$Ly \equiv -\frac{1}{w(r)}\left[p(r)\,y'\right]' + q(r)\,y = \lambda y \tag{8.185}$$

on the interval $[0,R]$. Here $p(r)$, $q(r)$ and $w(r)$ are all supposed real, so the equation is formally self-adjoint with respect to the inner product

$$\langle u, v\rangle_w = \int_0^R w\,u^*v\,dr. \tag{8.186}$$

If $r = 0$ is a singular point of (8.185), then we will be unable to impose boundary conditions of our accustomed form

$$a\,y(0) + b\,y'(0) = 0 \tag{8.187}$$

because one or both of the linearly independent solutions $y_1(r)$ and $y_2(r)$ will diverge as $r \to 0$. The range of possibilities was enumerated by Weyl:

Theorem (Hermann Weyl, 1910): Suppose that $r = 0$ is a singular point and $r = R$ a regular point of the differential equation (8.185). Then

I. Either:

a) Limit-circle case: There exists a $\lambda_0$ such that both of the linearly independent solutions to (8.185) have a $w$ norm that is convergent in the vicinity of $r = 0$. In this case both solutions have convergent $w$ norm for all values of $\lambda$.

Or

b) Limit-point case: No more than one solution has convergent $w$ norm for any $\lambda$.

II. In either case, whenever $\mathrm{Im}\,\lambda \ne 0$, there is at least one finite-norm solution. When $\lambda$ lies on the real axis there may or may not exist a finite-norm solution.

We will not attempt to prove Weyl's theorem. The proof is not difficult and may be found in many standard texts.³ It is just a little more technical than the level of this book. We will instead illustrate it with enough examples to make the result plausible, and its practical consequences clear.

When we come to construct the resolvent $R_\lambda(r,r')$ obeying

$$(L - \lambda I)\,R_\lambda(r,r') = \delta(r - r') \tag{8.188}$$

by writing it as a product of $y_<$ and $y_>$ we are obliged to choose a normalizable function for $y_<$, the solution obeying the boundary condition at $r = 0$. We must do this so that the range of $R_\lambda$ will be in $L^2[0,R]$. In the limit-point case, and when $\mathrm{Im}\,\lambda \ne 0$, there is only one choice for $y_<$. There is therefore a unique resolvent, a unique self-adjoint operator $L - \lambda I$ of which $R_\lambda$ is the inverse, and hence $L$ is a uniquely specified differential operator.⁴

³ For example: Ivar Stakgold, Boundary Value Problems of Mathematical Physics, Volume I (SIAM 2000).

⁴ When $\lambda$ is on the real axis then there may be no normalizable solution, and $R_\lambda$ cannot exist. This will occur only when $\lambda$ is in the continuous spectrum of the operator $L$, and is not a problem as the same operator $L$ is obtained for any $\lambda$.

In the limit-circle case there is more than one choice for $y_<$ and hence more than one way of making $L$ into a self-adjoint operator. To what boundary conditions do these choices correspond?

Suppose that the two normalizable solutions for $\lambda = \lambda_0$ are $y_1(r)$ and $y_2(r)$. The essence of Weyl's theorem is that once we are sufficiently close to $r = 0$ the exact value of $\lambda$ is unimportant and all solutions behave as a linear combination of these two. We can therefore impose as a boundary condition that the allowed solutions be proportional to a specified real linear combination

$$y(r) \propto a\,y_1(r) + b\,y_2(r), \quad r \to 0. \tag{8.189}$$

This is a natural generalization of the regular case where we have a solution $y_1(r)$ with boundary conditions $y_1(0) = 1$, $y_1'(0) = 0$, so $y_1(r) \sim 1$, and a solution $y_2(r)$ with $y_2(0) = 0$, $y_2'(0) = 1$, so $y_2(r) \sim r$. The regular self-adjoint boundary condition

$$a\,y(0) + b\,y'(0) = 0 \tag{8.190}$$

with real $a$, $b$ then forces $y(r)$ to be

$$y(r) \propto b\,y_1(r) - a\,y_2(r) \sim b\cdot 1 - a\,r, \quad r \to 0. \tag{8.191}$$

Example: Consider the radial part of the Laplace eigenvalue problem in two dimensions.

$$L\psi \equiv -\frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) + \frac{m^2}{r^2}\psi = k^2\psi. \tag{8.192}$$

The differential operator $L$ is formally self-adjoint with respect to the inner product

$$\langle\psi,\chi\rangle = \int_0^R \psi^*\chi\,r\,dr. \tag{8.193}$$

When $k^2 = 0$, the $m^2 \ne 0$ equation has solutions $\psi = r^{\pm m}$, and, of the normalization integrals

$$\int_0^R |r^m|^2\,r\,dr, \qquad \int_0^R |r^{-m}|^2\,r\,dr, \tag{8.194}$$

only the first, containing the positive power of $r$, is convergent. For $m \ne 0$ we are therefore in Weyl's limit-point case. For $m^2 = 0$, however, the $k^2 = 0$ solutions are $\psi_1(r) = 1$ and $\psi_2(r) = \ln r$. Both normalization integrals

$$\int_0^R 1^2\,r\,dr, \qquad \int_0^R |\ln r|^2\,r\,dr \tag{8.195}$$

converge and we are in the limit-circle case at $r = 0$. When $k^2 > 0$ these solutions become

$$\begin{aligned}
J_0(kr) &= 1 - \frac{1}{4}(kr)^2 + \cdots,\\
N_0(kr) &= \left(\frac{2}{\pi}\right)\left[\ln(kr/2) + \gamma\right] + \cdots.
\end{aligned} \tag{8.196}$$

Both remain normalizable, in conformity with Weyl's theorem. The self-adjoint boundary conditions at $r \to 0$ are therefore that near $r = 0$ the allowed functions become proportional to

$$1 + \alpha\ln r \tag{8.197}$$

with $\alpha$ some specified real constant.
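For concreteness, the integrals behind this classification can be evaluated in closed form (a short worked check added here, with a lower cutoff $\epsilon$ introduced only to display the divergence). In the $m = 0$ case

$$\int_0^R 1^2\,r\,dr = \frac{R^2}{2}, \qquad \int_0^R (\ln r)^2\,r\,dr = \frac{R^2}{2}\left[(\ln R)^2 - \ln R + \frac{1}{2}\right],$$

both finite, whereas for the negative-power solution with $|m| \ge 1$

$$\int_\epsilon^R r^{-2|m|}\,r\,dr = \begin{cases}\ln(R/\epsilon), & |m| = 1,\\[4pt] \dfrac{\epsilon^{2-2|m|} - R^{2-2|m|}}{2|m| - 2}, & |m| > 1,\end{cases}$$

which grows without bound as $\epsilon \to 0$. The logarithm is integrable against the measure $r\,dr$; any inverse power of $r$ is not.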

Example: Consider the radial equation that arises when we separate the Laplace eigenvalue problem in spherical polar coordinates.

$$-\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\psi}{dr}\right) + \frac{l(l+1)}{r^2}\psi = k^2\psi. \tag{8.198}$$

When $k = 0$ this has solutions $\psi = r^l$, $r^{-l-1}$. For non-zero $l$ only the first of the normalization integrals

$$\int_0^R r^{2l}\,r^2\,dr, \qquad \int_0^R r^{-2l-2}\,r^2\,dr, \tag{8.199}$$

is finite. Thus, for $l \ne 0$, we are again in the limit-point case, and the boundary condition at the origin is uniquely determined by the requirement that the solution be normalizable.

When $l = 0$, however, the two $k^2 = 0$ solutions are $\psi_1(r) = 1$ and $\psi_2(r) = 1/r$. Both integrals

$$\int_0^R r^2\,dr, \qquad \int_0^R r^{-2}\,r^2\,dr \tag{8.200}$$

converge, so we are again in the limit-circle case. For positive $k^2$, these solutions evolve into

$$\psi_{1,k}(r) = j_0(kr) = \frac{1}{k}\,\frac{\sin kr}{r}, \qquad \psi_{2,k}(r) = -k\,n_0(kr) = \frac{\cos kr}{r}. \tag{8.201}$$

Near $r = 0$, we have $\psi_{1,k} \sim 1$ and $\psi_{2,k} \sim 1/r$, exactly the same behaviour as the $k^2 = 0$ solutions.

We obtain a self-adjoint operator if we choose a constant $a_s$ and demand that all functions in the domain be proportional to

$$\psi(r) \sim 1 - \frac{a_s}{r} \tag{8.202}$$

as we approach $r = 0$. If we write the solution with this boundary condition as

$$\psi_k(r) = \frac{\sin(kr + \eta)}{r} = \cos\eta\left(\frac{\sin(kr)}{r} + \tan\eta\,\frac{\cos(kr)}{r}\right) \sim k\cos\eta\left(1 + \frac{\tan\eta}{kr}\right), \tag{8.203}$$

we can read off the phase shift $\eta$ as

$$\tan\eta(k) = -k\,a_s. \tag{8.204}$$

These boundary conditions arise in quantum mechanics when we study the scattering of particles whose de Broglie wavelength is much larger than the range of the scattering potential. The incident wave is unable to resolve any of the internal structure of the potential and perceives its effect only as a singular boundary condition at the origin. In this context the constant $a_s$ is called the scattering length. This physical model explains why only the $l = 0$ partial waves have a choice of boundary condition: classical particles with angular momentum $l \ne 0$ would miss the origin by a distance $r_{\min} = l/k$ and never see the potential.

The quantum picture also helps explain the physical origin of the distinction between the limit-point and limit-circle cases. A point potential can have a bound state that extends far beyond the short range of the potential. If the corresponding eigenfunction is normalizable, the bound particle has a significant amplitude to be found at non-zero $r$, and this amplitude must be included in the completeness relation and in the eigenfunction expansion of the Green function. When the state is not normalizable, however, the particle spends all its time very close to the potential, and its eigenfunction makes zero contribution to the Green function and completeness sum at any non-zero $r$. Any admixture of this non-normalizable state allowed by the boundary conditions can therefore be ignored, and, as far as the external world is concerned, all boundary conditions look alike. The next few exercises will illustrate this.

Exercise 8.5: The two-dimensional "delta-function" potential. Consider the quantum mechanical problem in $\mathbb{R}^2$

$$\left(-\nabla^2 + V(|\mathbf{r}|)\right)\psi = E\psi$$

with $V$ an attractive circular square well.

$$V(r) = \begin{cases} -\lambda/\pi a^2, & r < a,\\ 0, & r > a.\end{cases}$$

The factor of $\pi a^2$ has been inserted to make this a regulated version of $V(\mathbf{r}) = -\lambda\,\delta^2(\mathbf{r})$. Let $\mu = \sqrt{\lambda/\pi a^2}$.

i) By matching the functions

$$\psi(r) \propto \begin{cases} J_0(\mu r), & r < a,\\ K_0(\kappa r), & r > a,\end{cases}$$

at $r = a$, show that as $a$ becomes small, we can scale $\lambda$ towards zero in such a way that the well becomes infinitely deep yet there remains a single bound state with finite binding energy

$$E_0 \equiv \kappa^2 = \frac{4}{a^2}\,e^{-2\gamma}e^{-4\pi/\lambda}.$$

It is only after scaling $\lambda$ in this way that we have a well-defined quantum mechanical problem with a "point" potential.

ii) Show that in the scaling limit, the associated wavefunction obeys the singular-endpoint boundary condition

$$\psi(r) \to 1 + \alpha\ln r, \quad r \to 0,$$

where

$$\alpha = \frac{1}{\gamma + \ln\kappa/2}.$$

Observe that by varying $\kappa^2$ between $0$ and $\infty$ we can make $\alpha$ be any real number. So the entire range of possible self-adjoint boundary conditions may be obtained by specifying the binding energy of an attractive potential.

iii) Assume that we have fixed the boundary conditions by specifying $\kappa$, and consider the scattering of unbound particles off the short-range potential. It is natural to define the phase shift $\eta(k)$ so that

$$\psi_k(r) = \cos\eta\,J_0(kr) - \sin\eta\,N_0(kr) \sim \sqrt{\frac{2}{\pi kr}}\,\cos(kr - \pi/4 + \eta), \quad r \to \infty.$$

Show that

$$\cot\eta = \left(\frac{2}{\pi}\right)\ln k/\kappa.$$

Exercise 8.6: The three-dimensional "delta-function" potential. Repeat the calculation of the previous exercise for the case of a three-dimensional delta-function potential

$$V(r) = \begin{cases} -\lambda/(4\pi a^3/3), & r < a,\\ 0, & r > a.\end{cases}$$

i) Show that as we take $a \to 0$, the delta-function strength $\lambda$ can be adjusted so that the scattering length becomes

$$a_s = \left(\frac{\lambda}{4\pi a^2} - \frac{1}{a}\right)^{-1}$$

and remains finite.

ii) Show that when this $a_s$ is positive, the attractive potential supports a single bound state with external wavefunction

$$\psi(r) \propto \frac{1}{r}\,e^{-\kappa r}$$

where $\kappa = a_s^{-1}$.

Exercise 8.7: The pseudo-potential. Consider a particle of mass $\mu$ confined in a large sphere of radius $R$. At the center of the sphere is a singular potential whose effects can be parameterized by its scattering length $a_s$ and the resultant phase shift

$$\eta(k) \approx \tan\eta(k) = -a_sk.$$

In the absence of the potential, the normalized $l = 0$ wavefunctions would be

$$\psi_n(r) = \sqrt{\frac{1}{2\pi R}}\,\frac{\sin k_nr}{r}$$

where $k_n = n\pi/R$.

i) Show that the presence of the singular potential perturbs the $\psi_n$ eigenstate so that its energy $E_n$ changes by an amount

$$\Delta E_n = \frac{\hbar^2}{2\mu}\,\frac{2a_sk_n^2}{R}.$$

ii) Show this energy shift can be written as if it were the result of applying first-order perturbation theory

$$\Delta E_n \approx \langle n|V_{\rm ps}|n\rangle \equiv \int d^3r\,|\psi_n|^2\,V_{\rm ps}(\mathbf{r})$$

to an artificial pseudo-potential

$$V_{\rm ps}(\mathbf{r}) = \frac{2\pi a_s\hbar^2}{\mu}\,\delta^3(\mathbf{r}).$$

Although the energy shift is small when $R$ is large, it is not a first-order perturbation effect and the pseudo-potential is a convenient fiction which serves to parameterize the effect of the true potential. Even the sign of the pseudo-potential may differ from that of the actual short distance potential. For our attractive "delta function", for example, the pseudo-potential changes from being attractive to being repulsive as the bound state is peeled off the bottom of the unbound continuum. The change of sign occurs not by $a_s$ passing through zero, but by it passing through infinity. It is difficult to manipulate a single potential so as to see this dramatic effect, but when the particles have spin, and a spin-dependent interaction potential, it is possible to use a magnetic field to arrange for a bound state of one spin configuration to pass through the zero of energy of the other. The resulting Feshbach resonance has the same effect on the scattering length as the conceptually simpler shape resonance obtained by tuning the single potential.

The pseudo-potential formula is commonly used to describe the pairwise interaction of a dilute gas of particles of mass $m$, where it reads

$$V_{\rm ps}(\mathbf{r}) = \frac{4\pi a_s\hbar^2}{m}\,\delta^3(\mathbf{r}). \tag{8.205}$$

The internal energy-density of the gas due to the two-body interaction then becomes

$$u(\rho) = \frac{1}{2}\,\frac{4\pi a_s\hbar^2}{m}\,\rho^2,$$

where $\rho$ is the particle-number density.

The factor of two difference between the formula in the exercise and (8.205) arises because the $\mu$ in the exercise must be understood as the reduced mass $\mu = m^2/(m+m) = m/2$ of the pair of interacting particles.
Example: In $n$ dimensions, the "$l = 0$" part of the Laplace operator is

$$\frac{d^2}{dr^2} + \frac{(n-1)}{r}\frac{d}{dr}.$$

This is formally self-adjoint with respect to the natural inner product

$$\langle\psi,\chi\rangle_n = \int_0^\infty r^{n-1}\,\psi^*\chi\,dr.$$

The zero eigenvalue solutions are $\psi_1(r) = 1$ and $\psi_2(r) = r^{2-n}$. The second of these ceases to be normalizable once $n \ge 4$. In four space dimensions and above, therefore, we are always in the limit-point case. No point interaction, no matter how strong, can affect the physics. This non-interaction result extends, with slight modification, to the quantum field theory of relativistic particles. Here we find that contact interactions become irrelevant or non-renormalizable in more than four space-time dimensions.
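The dimension count can be made explicit (a one-line check added here, assuming the radial measure $r^{n-1}\,dr$ and $n \ne 2$, since in two dimensions the second solution is $\ln r$ rather than a power). Near the origin the norm of $\psi_2 = r^{2-n}$ is governed by

$$\int_0^R \left|r^{2-n}\right|^2 r^{n-1}\,dr = \int_0^R r^{3-n}\,dr = \begin{cases}\dfrac{R^{4-n}}{4-n}, & n < 4,\\[6pt] \infty, & n \ge 4,\end{cases}$$

with a logarithmic divergence at $n = 4$ and a power-law divergence for $n > 4$. For $n = 3$ this reproduces the normalizable $1/r$ solution of the limit-circle case found earlier.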



8.5 Further Exercises and Problems

Here are some further problems involving Legendre polynomials, associated Legendre functions and Bessel functions:

Exercise 8.8: A sphere of radius $a$ is made by joining two conducting hemispheres along their equators. The hemispheres are electrically insulated from one another and maintained at two different potentials $V_1$ and $V_2$.

a) Starting from the general expression

$$V(r,\theta) = \sum_{l=0}^{\infty}\left(a_lr^l + \frac{b_l}{r^{l+1}}\right)P_l(\cos\theta)$$

find an integral expression for the coefficients $a_l$, $b_l$ that are relevant to the electric field outside the sphere. Evaluate the integrals giving $b_1$, $b_2$ and $b_3$.

b) Use your results from part a) to compute the electric dipole moment of the sphere as a function of the potential difference $V_1 - V_2$.

c) Now the two hemispheres are electrically connected and the entire surface is at one potential. The sphere is immersed in a uniform electric field $\mathbf{E}$. What is its dipole moment now?

Problem 8.9: Tides and Gravity. The Earth is not exactly spherical. Two major causes of the deviation from sphericity are the Earth's rotation and the tidal forces it feels from the Sun and the Moon. In this problem we will study the effects of rotation and tides on a self-gravitating sphere of fluid of uniform density $\rho_0$.

a) Consider the equilibrium of a nearly spherical body of fluid rotating homogeneously with angular velocity $\omega_0$. Show that the effect of rotation can be accounted for by introducing an "effective gravitational potential"

$$\varphi_{\rm eff} = \varphi_{\rm grav} + \frac{1}{3}\,\omega_0^2R^2\left(P_2(\cos\theta) - 1\right),$$

where $R$, $\theta$ are spherical coordinates defined with their origin in the centre of the body and $z$ along the axis of rotation.

b) A small planet is in a circular orbit about a distant massive star. It rotates about an axis perpendicular to the plane of the orbit so that it always keeps the same face directed towards the star. Show that the planet experiences an effective external potential

$$\varphi_{\rm tidal} = -\Omega^2R^2P_2(\cos\theta),$$

together with a potential, of the same sort as in part a), that arises from the once-per-orbit rotation. Here $\Omega$ is the orbital angular velocity, and $R$, $\theta$ are spherical coordinates defined with their origin at the centre of the planet and $z$ pointing at the star.

c) Each of the external potentials slightly deforms the initially spherical planet so that the surface is given by

$$R(\theta,\phi) = R_0 + \eta\,P_2(\cos\theta).$$

(With $\theta$ being measured with respect to different axes for the rotation and tidal effects.) Show that, to first order in $\eta$, this deformation does not alter the volume of the body. Observe that positive $\eta$ corresponds to a prolate spheroid and negative $\eta$ to an oblate one.

d) The gravitational field of the deformed spheroid can be found by approximating it as an undeformed homogeneous sphere of radius $R_0$, together with a thin spherical shell of radius $R_0$ and surface mass density $\sigma = \rho_0\eta\,P_2(\cos\theta)$. Use the general axisymmetric solution

$$\varphi(R,\theta,\phi) = \sum_{l=0}^{\infty}\left(A_lR^l + \frac{B_l}{R^{l+1}}\right)P_l(\cos\theta)$$

of Laplace's equation, together with Poisson's equation

$$\nabla^2\varphi = 4\pi G\rho(\mathbf{r})$$

for the gravitational potential, to obtain expressions for $\varphi_{\rm shell}$ in the regions $R > R_0$ and $R < R_0$.

e) The surface of the fluid will be an equipotential of the combined potentials of the homogeneous sphere, the thin shell, and the effective external potential of the tidal or centrifugal forces. Use this fact to find $\eta$ (to lowest order in the angular velocities) for the two cases. Do not include the centrifugal potential from part b) when computing the tidal distortion. We never include the variation of the centrifugal potential across a planet when calculating tidal effects. This is because this variation is due to the once-per-year rotation,⁵ and contributes to the oblate equatorial bulge and not to the prolate tidal bulge. (Answer: $\eta_{\rm rot} = -\frac{5}{2}\frac{\omega_0^2R_0}{4\pi G\rho_0}$, and $\eta_{\rm tide} = \frac{15}{2}\frac{\Omega^2R_0}{4\pi G\rho_0}$.)

Exercise 8.10: Dielectric Sphere. Consider a solid dielectric sphere of radius $a$ and permittivity $\epsilon$. The sphere is placed in an electric field which takes the constant value $\mathbf{E} = E_0\hat{\mathbf{z}}$ a long distance from the sphere. Recall that Maxwell's equations require that $D_\perp$ and $E_\parallel$ be continuous across the surface of the sphere.

a) Use the expansions

$$\Phi_{\rm in} = \sum_l A_l\,r^l\,P_l(\cos\theta),$$

$$\Phi_{\rm out} = \sum_l \left(B_l\,r^l + C_l\,r^{-l-1}\right)P_l(\cos\theta),$$

and find all non-zero coefficients $A_l$, $B_l$, $C_l$.

b) Show that the $\mathbf{E}$ field inside the sphere is uniform and of magnitude $\frac{3\epsilon_0}{\epsilon + 2\epsilon_0}E_0$.

c) Show that the electric field is unchanged if the dielectric is replaced by the polarization-induced surface charge density

$$\sigma_{\rm induced} = 3\epsilon_0\left(\frac{\epsilon - \epsilon_0}{\epsilon + 2\epsilon_0}\right)E_0\cos\theta.$$

(Some systems of units may require extra $4\pi$'s in this last expression. In SI units $\mathbf{D} \equiv \epsilon\mathbf{E} = \epsilon_0\mathbf{E} + \mathbf{P}$, and the polarization-induced charge density is $\rho_{\rm induced} = -\nabla\cdot\mathbf{P}$.)

Exercise 8.11: Hollow Sphere. The potential on a spherical surface of radius $a$ is $\Phi(\theta,\phi)$. We want to express the potential inside the sphere as an integral over the surface in a manner analogous to the Poisson kernel in two dimensions.

a) By using the generating function for Legendre polynomials, show that

$$\frac{1 - r^2}{(1 + r^2 - 2r\cos\theta)^{3/2}} = \sum_{l=0}^{\infty}(2l+1)\,r^l\,P_l(\cos\theta), \quad r < 1.$$

b) Starting from the expansion

$$\Phi_{\rm in}(r,\theta,\phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}A_{lm}\,r^l\,Y_l^m(\theta,\phi)$$

and using the addition formula for spherical harmonics, show that

$$\Phi_{\rm in}(r,\theta,\phi) = \frac{a(a^2 - r^2)}{4\pi}\int_{S^2}\frac{\Phi(\theta',\phi')}{(r^2 + a^2 - 2ar\cos\gamma)^{3/2}}\,d\cos\theta'\,d\phi',$$

where $\cos\gamma = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\phi - \phi')$.

c) By setting $r = 0$, deduce that a three dimensional harmonic function cannot have a local maximum or minimum.

⁵ Our earth rotates about its axis $365\frac{1}{4} + 1$ times in a year, not $365\frac{1}{4}$ times. The "+1" is this effect.

Problem 8.12: We have several times met with the Pöschl-Teller eigenvalue
problem

$$ \left(-\frac{d^2}{dx^2} - n(n+1)\,{\rm sech}^2 x\right)\psi = E\psi, \qquad (\star) $$

in the particular case that $n = 1$. We now consider this problem for any
positive integer $n$.

a) Set $\xi = \tanh x$ in (★) and show that it becomes

$$ \frac{d}{d\xi}\left((1-\xi^2)\frac{d\psi}{d\xi}\right) + \left(n(n+1) + \frac{E}{1-\xi^2}\right)\psi = 0. $$

b) Compare the equation in part a) with the associated Legendre equation
and deduce that the bound-state eigenfunctions and eigenvalues of the
original Pöschl-Teller equation are

$$ \psi_m(x) = P^m_n(\tanh x), \quad E_m = -m^2, \quad m = 1,\ldots,n, $$

where $P^m_n(\xi)$ is the associated Legendre function. Observe that the list
of bound states does not include $\psi_0 = P^0_n(\tanh x) \equiv P_n(\tanh x)$. This is
because $\psi_0$ is not normalizable, being the lowest of the unbound $E \ge 0$
continuous-spectrum states.

c) Now seek continuous spectrum solutions to (★) in the form

$$ \psi_k(x) = e^{ikx} f(\tanh x), $$

and show that if we take $E = k^2$, where $k$ is any real number, then $f(\xi)$ obeys

$$ (1-\xi^2)\frac{d^2 f}{d\xi^2} + 2(ik-\xi)\frac{df}{d\xi} + n(n+1)f = 0. \qquad (\star\star) $$

d) Let us denote by $P^{(k)}_n(\xi)$ the solutions of (★★) that reduce to the Legendre
polynomial $P_n(\xi)$ when $k = 0$. Show that the first few $P^{(k)}_n(\xi)$ are

$$ P^{(k)}_0(\xi) = 1, \quad P^{(k)}_1(\xi) = \xi - ik, \quad P^{(k)}_2(\xi) = \frac12\left(3\xi^2 - 1 - 3ik\xi - k^2\right). $$

Explore the properties of the $P^{(k)}_n(\xi)$, and show that they include
i) $P^{(k)}_n(-\xi) = (-1)^n P^{(-k)}_n(\xi)$.
ii) $(n+1)P^{(k)}_{n+1}(\xi) = (2n+1)\,\xi\,P^{(k)}_n(\xi) - (n + k^2/n)\,P^{(k)}_{n-1}(\xi)$.
iii) $P^{(k)}_n(1) = (1-ik)(2-ik)\cdots(n-ik)/n!$.

(The $P^{(k)}_n(\xi)$ are the $\nu = -\mu = ik$ special case of the Jacobi polynomials
$P^{(\nu,\mu)}_n(\xi)$.)
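
The properties listed in part d) can be spot-checked numerically before attempting a proof. The following is a minimal Python sketch, not part of the original problem: it builds the polynomials from recurrence ii), starting from the first two members quoted above, and then evaluates the left-hand side of (★★) coefficient by coefficient. The helper names (poly_mul, p_nk, ode_residual, and so on) are hypothetical, and polynomials are stored as lists of complex coefficients, lowest power first.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest power first)."""
    n = max(len(p), len(q))
    p = p + [0j] * (n - len(p))
    q = q + [0j] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(p, c):
    return [c * a for a in p]

def poly_mul(p, q):
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative; a constant differentiates to the zero polynomial [0]."""
    return [i * a for i, a in enumerate(p)][1:] or [0j]

def poly_eval(p, x):
    result = 0j
    for a in reversed(p):
        result = result * x + a
    return result

def p_nk(n, k):
    """P_n^(k) from recurrence ii), seeded with P_0 = 1 and P_1 = xi - i k."""
    prev, cur = [1 + 0j], [-1j * k, 1 + 0j]
    if n == 0:
        return prev
    for m in range(1, n):
        xi_times_cur = poly_mul([0j, 1 + 0j], cur)
        nxt = poly_add(poly_scale(xi_times_cur, 2 * m + 1),
                       poly_scale(prev, -(m + k * k / m)))
        prev, cur = cur, poly_scale(nxt, 1.0 / (m + 1))
    return cur

def ode_residual(n, k):
    """Coefficients of (1 - xi^2) f'' + 2 (i k - xi) f' + n (n + 1) f for f = P_n^(k)."""
    f = p_nk(n, k)
    df = poly_diff(f)
    d2f = poly_diff(df)
    term1 = poly_mul([1 + 0j, 0j, -1 + 0j], d2f)
    term2 = poly_mul([2j * k, -2 + 0j], df)
    term3 = poly_scale(f, n * (n + 1))
    return poly_add(poly_add(term1, term2), term3)

k = 0.7
for n in range(5):
    print(n, max(abs(c) for c in ode_residual(n, k)))
```

Every printed residual should be zero to rounding error. Property iii) can be checked in the same way by comparing poly_eval(p_nk(n, k), 1.0) with the product formula.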

Problem 8.13: Bessel functions and impact parameters. In two dimensions we
can expand a plane wave as

$$ e^{iky} = \sum_{n=-\infty}^{\infty} J_n(kr)\,e^{in\theta}. $$

a) What do you think the resultant wave will look like if we take only a
finite segment of this sum? For example

$$ \psi(x) = \sum_{l=10}^{17} J_l(kr)\,e^{il\theta}. $$

Think about:

i) The quantum interpretation of $\hbar l$ as angular momentum $= \hbar k d$,
where $d$ is the impact parameter, the amount by which the incoming
particle misses the origin.

ii) Diffraction: one cannot have a plane wave of finite width.

b) After writing down your best guess for the previous part, confirm your
understanding by using Mathematica or other package to plot the real
part of $\psi$ as defined above. The following Mathematica code may work.

    Clear[bit, tot]
    bit[l_, x_, y_] := Cos[l ArcTan[x, y]] BesselJ[l, Sqrt[x^2 + y^2]]
    tot[x_, y_] := Sum[bit[l, x, y], {l, 10, 17}]
    ContourPlot[tot[x, y], {x, -40, 40}, {y, -40, 40}, PlotPoints -> 200]
    Display["wave", %, "EPS"]

Run it, or some similar code, as a batchfile. Try different ranges for the
sum.

Exercise 8.14: Consider the two-dimensional Fourier transform

$$ \tilde f({\bf k}) = \int e^{i{\bf k}\cdot{\bf x}} f({\bf x})\,d^2x $$

of a function that in polar co-ordinates is of the form $f(r,\theta) = \exp\{-il\theta\}f(r)$.
a) Show that

$$ \tilde f({\bf k}) = 2\pi i^l e^{-il\theta_k}\int_0^\infty J_l(kr)f(r)\,r\,dr, $$

where $k$, $\theta_k$ are the polar co-ordinates of ${\bf k}$.
b) Use the inversion formula for the two-dimensional Fourier transform to
establish the inversion formula (8.120) for the Hankel transform

$$ F(k) = \int_0^\infty J_l(kr)f(r)\,r\,dr. $$






Chapter 9 
Integral Equations 



A problem involving a differential equation can often be recast as one involv- 
ing an integral equation. Sometimes this new formulation suggests a method 
of attack or approximation scheme that would not have been apparent in the 
original language. It is also usually easier to extract general properties of the 
solution when the problem is expressed as an integral equation. 



9.1 Illustrations 

Here are some examples: 

A boundary-value problem: Consider the differential equation for the unknown
$u(x)$

$$ -u'' + \lambda V(x)u = 0 \tag{9.1} $$

with the boundary conditions $u(0) = u(L) = 0$. To turn this into an integral
equation we introduce the Green function

$$ G(x,y) = \begin{cases} \frac{1}{L}\,x(L-y), & 0<x<y<L,\\[2pt] \frac{1}{L}\,y(L-x), & 0<y<x<L, \end{cases} \tag{9.2} $$

so that

$$ -\frac{d^2}{dx^2}G(x,y) = \delta(x-y). \tag{9.3} $$

Then we can pretend that $\lambda V(x)u(x)$ in the differential equation is a known
source term, and substitute it for "$f(x)$" in the usual Green function solution.
We end up with

$$ u(x) + \lambda\int_0^L G(x,y)V(y)u(y)\,dy = 0. \tag{9.4} $$

This integral equation for $u$ has not solved the problem, but is equivalent
to the original problem. Note, in particular, that the boundary conditions
are implicit in this formulation: if we set $x = 0$ or $L$ in the second term, it
becomes zero because the Green function is zero at those points. The integral
equation then says that $u(0)$ and $u(L)$ are both zero.
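
The equivalence of the two formulations can be checked numerically in a case where the answer is known. The following is a minimal Python sketch, assuming the special case $V(x) = -1$ on $[0,1]$ (our choice, not taken from the text), for which $u(x) = \sin\pi x$ with $\lambda = \pi^2$ solves the boundary-value problem. The function names are hypothetical, and the Green function coded here is the one obeying $-G'' = \delta(x-y)$ with $G = 0$ at both ends.

```python
import math

def green(x, y, length=1.0):
    """Green function of -d^2/dx^2 on [0, length] with G = 0 at both ends."""
    lo, hi = (x, y) if x <= y else (y, x)
    return lo * (length - hi) / length

def apply_green(values, i, length=1.0):
    """Trapezoid estimate of the integral of G(x_i, y) * values(y) over [0, length]."""
    n = len(values) - 1
    h = length / n
    total = 0.0
    for j, v in enumerate(values):
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * green(i * h, j * h, length) * v
    return h * total

def max_residual(n=200):
    """Largest |u + lam * int G V u dy| on the grid for V = -1, u = sin(pi x)."""
    lam = math.pi ** 2
    u = [math.sin(math.pi * i / n) for i in range(n + 1)]
    v_times_u = [-ui for ui in u]            # V(y) u(y) with V = -1
    return max(abs(u[i] + lam * apply_green(v_times_u, i)) for i in range(n + 1))

print(max_residual())
```

The residual is small but not zero, because the trapezoid rule is only second-order accurate; it shrinks as the grid is refined. The grid values at $x = 0$ and $x = L$ satisfy the equation automatically, which is the "implicit boundary condition" remarked on above.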

An initial value problem: Consider essentially the same differential equation
as before, but now with initial data:

$$ -u'' + V(x)u = 0, \quad u(0) = 0, \quad u'(0) = 1. \tag{9.5} $$

In this case, we claim that the inhomogeneous integral equation

$$ u(x) - \int_0^x (x-t)V(t)u(t)\,dt = x, \tag{9.6} $$

is equivalent to the given problem. Let us check the claim. First, the initial
conditions. Rewrite the integral equation as

$$ u(x) = x + \int_0^x (x-t)V(t)u(t)\,dt, \tag{9.7} $$

so it is manifest that $u(0) = 0$. Now differentiate to get

$$ u'(x) = 1 + \int_0^x V(t)u(t)\,dt. \tag{9.8} $$

This shows that $u'(0) = 1$, as required. Differentiating once more confirms
that $u'' = V(x)u$.
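
Because the upper limit of the integral is $x$, the equation can be solved by marching forward from $x = 0$. The following is a minimal Python sketch of that idea, assuming a uniform grid and the trapezoid rule (both our choices); solve_volterra is a hypothetical name. It is tried on $V = 1$, for which the initial-value problem has the solution $u = \sinh x$.

```python
import math

def solve_volterra(V, x_max, n):
    """March u(x) = x + int_0^x (x - t) V(t) u(t) dt on a uniform grid.

    The trapezoid rule is used for the integral.  Because the factor (x - t)
    vanishes at t = x, the unknown u(x_i) never appears on the right-hand
    side, so each step is explicit.
    """
    h = x_max / n
    xs = [i * h for i in range(n + 1)]
    u = [0.0] * (n + 1)                      # u(0) = 0 follows from the equation
    for i in range(1, n + 1):
        total = 0.5 * (xs[i] - xs[0]) * V(xs[0]) * u[0]
        for j in range(1, i):
            total += (xs[i] - xs[j]) * V(xs[j]) * u[j]
        u[i] = xs[i] + h * total
    return xs, u

xs, u = solve_volterra(lambda x: 1.0, 1.0, 200)
print(u[-1], math.sinh(1.0))
```

No initial data is passed to the routine: both $u(0) = 0$ and $u'(0) = 1$ are encoded in the integral equation itself.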

These examples reveal that one advantage of the integral equation for- 
mulation is that the boundary or initial value conditions are automatically 
encoded in the integral equation itself, and do not have to be added as riders. 



9.2 Classification of Integral Equations 



The classification of linear integral equations is best described by a list: 
A) i) Limits on integrals fixed ⇒ Fredholm equation,
   ii) One integration limit is x ⇒ Volterra equation.

B) i) Unknown under integral only ⇒ Type I.
   ii) Unknown also outside integral ⇒ Type II.

C) i) Homogeneous,
   ii) Inhomogeneous.

For example,

$$ u(x) = \int_0^L G(x,y)u(y)\,dy \tag{9.9} $$

is a Type II homogeneous Fredholm equation, whilst

$$ u(x) = x + \int_0^x (x-t)V(t)u(t)\,dt \tag{9.10} $$

is a Type II inhomogeneous Volterra equation.
The equation

$$ f(x) = \int_a^b K(x,y)u(y)\,dy, \tag{9.11} $$

an inhomogeneous Type I Fredholm equation, is analogous to the matrix
equation

$$ {\bf K}{\bf x} = {\bf b}. \tag{9.12} $$

On the other hand, the equation

$$ u(x) = \frac{1}{\lambda}\int_a^b K(x,y)u(y)\,dy, \tag{9.13} $$

a homogeneous Type II Fredholm equation, is analogous to the matrix
eigenvalue problem

$$ {\bf K}{\bf x} = \lambda{\bf x}. \tag{9.14} $$

Finally,

$$ f(x) = \int_a^x K(x,y)u(y)\,dy, \tag{9.15} $$

an inhomogeneous Type I Volterra equation, is the analogue of a system of
linear equations involving an upper triangular matrix.

The function $K(x,y)$ appearing in these expressions is called the
kernel. The phrase "kernel of the integral operator" can therefore refer either
to the function $K$ or the nullspace of the operator. The context should make
clear which meaning is intended.
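
The triangular-matrix analogy for the Type I Volterra equation can be made concrete by discretizing the integral. The following is a minimal Python sketch, assuming the midpoint rule on a uniform grid (our choice); the routine name and the test kernel are hypothetical. Row $i$ of the discretized system involves only the unknowns at $y < x_i$, so the system is triangular and is solved by substitution, one row at a time. Whether one calls the matrix upper or lower triangular depends only on how the rows and columns are drawn.

```python
import math

def solve_first_kind_volterra(kernel, f, x_max, n):
    """Solve f(x) = int_0^x K(x, y) u(y) dy by the midpoint rule.

    Returns the midpoints y_j and the approximate values u(y_j).  Row i of the
    discretized system only involves u_1 ... u_i, so the matrix is triangular
    and substitution suffices.
    """
    h = x_max / n
    mids = [(j + 0.5) * h for j in range(n)]
    u = []
    for i in range(n):
        x = (i + 1) * h
        known = sum(kernel(x, mids[j]) * u[j] for j in range(i))
        u.append((f(x) / h - known) / kernel(x, mids[i]))
    return mids, u

# Hypothetical test case: K(x, y) = 1 + (x - y) and f(x) = x, whose exact
# solution is u(y) = exp(-y).
mids, u = solve_first_kind_volterra(lambda x, y: 1.0 + (x - y), lambda x: x, 2.0, 200)
worst = max(abs(ui - math.exp(-yi)) for yi, ui in zip(mids, u))
print(worst)
```

The same substitution strategy fails for a Fredholm equation, where every row couples to every unknown and a full linear solve is needed.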
9.3 Integral Transforms

When the kernel of a Fredholm equation is of the form $K(x-y)$, with $x$ and
$y$ taking values on the entire real line, then it is translation invariant and we
can solve the integral equation by using the Fourier transformation

$$ \tilde u(k) = {\cal F}(u) = \int_{-\infty}^{\infty} u(x)e^{ikx}\,dx \tag{9.16} $$

$$ u(x) = {\cal F}^{-1}(\tilde u) = \int_{-\infty}^{\infty} \tilde u(k)e^{-ikx}\,\frac{dk}{2\pi} \tag{9.17} $$

Integral equations involving translation-invariant Volterra kernels usually
succumb to a Laplace transform

$$ \tilde u(p) = {\cal L}(u) = \int_0^\infty u(x)e^{-px}\,dx \tag{9.18} $$

$$ u(x) = {\cal L}^{-1}(\tilde u) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \tilde u(p)e^{px}\,dp. \tag{9.19} $$

The Laplace inversion formula is the Bromwich contour integral, where $\gamma$ is
chosen so that all the singularities of $\tilde u(p)$ lie to the left of the contour. In
practice one finds the inverse Laplace transform by using a table of Laplace
transforms, such as the Bateman tables of integral transforms mentioned in
the introduction to chapter 8.

For kernels of the form $K(x/y)$ the Mellin transform,

$$ \tilde u(\sigma) = {\cal M}(u) = \int_0^\infty u(x)x^{\sigma-1}\,dx \tag{9.20} $$

$$ u(x) = {\cal M}^{-1}(\tilde u) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} \tilde u(\sigma)x^{-\sigma}\,d\sigma, \tag{9.21} $$

is the tool of choice. Again the inversion formula requires a Bromwich contour
integral, and so usually recourse to tables of Mellin transforms.

9.3.1 Fourier Methods

The class of problems that succumb to a Fourier transform can be thought
of as a continuous version of a matrix problem where the entries in the matrix
depend only on their distance from the main diagonal (figure 9.1).

Figure 9.1: The matrix form of the equation $\int K(x-y)u(y)\,dy = f(x)$

Example: Consider the type II Fredholm equation

$$ u(x) - \lambda\int_{-\infty}^{\infty} e^{-|x-y|}u(y)\,dy = f(x), \tag{9.22} $$

where we will assume that $\lambda < 1/2$. Here the $x$-space kernel operator

$$ K(x-y) = \delta(x-y) - \lambda e^{-|x-y|} \tag{9.23} $$

has Fourier transform

$$ \tilde K(k) = 1 - \frac{2\lambda}{k^2+1} = \frac{k^2+(1-2\lambda)}{k^2+1} = \frac{k^2+a^2}{k^2+1}, \tag{9.24} $$

where $a^2 = 1-2\lambda$. From

$$ \left(\frac{k^2+a^2}{k^2+1}\right)\tilde u(k) = \tilde f(k) \tag{9.25} $$

we find

$$ \tilde u(k) = \left(\frac{k^2+1}{k^2+a^2}\right)\tilde f(k) = \left(1 + \frac{1-a^2}{k^2+a^2}\right)\tilde f(k). \tag{9.26} $$

Inverting the Fourier transform gives

$$ u(x) = f(x) + \frac{1-a^2}{2a}\int_{-\infty}^{\infty} e^{-a|x-y|}f(y)\,dy = f(x) + \frac{\lambda}{\sqrt{1-2\lambda}}\int_{-\infty}^{\infty} e^{-\sqrt{1-2\lambda}\,|x-y|}f(y)\,dy. \tag{9.27} $$

This solution is no longer valid when the parameter $\lambda$ exceeds 1/2. This is
because zero then lies in the spectrum of the operator we are attempting to
invert. The spectrum is continuous and the Fredholm alternative does not
apply.
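
The two ingredients of this example are easy to confirm numerically: the Fourier transform of $e^{-|x|}$ is $2/(k^2+1)$, and the transform of the solving kernel read off from (9.26) is the reciprocal of the transform of the original kernel. The following is a minimal Python sketch under those assumptions; the function names, the cutoff and the sample values of $k$ and $\lambda$ are our own choices.

```python
import math

def fourier_of_exp_abs(k, cutoff=40.0, n=40000):
    """Midpoint estimate of int exp(ikx) exp(-|x|) dx = 2 int_0^inf cos(kx) exp(-x) dx."""
    h = cutoff / n
    return 2.0 * h * sum(math.cos(k * (j + 0.5) * h) * math.exp(-(j + 0.5) * h)
                         for j in range(n))

def kernel_symbol(k, lam):
    """Fourier transform of delta(x) - lam * exp(-|x|), as in (9.24)."""
    return 1.0 - 2.0 * lam / (k * k + 1.0)

def resolvent_symbol(k, lam):
    """Transform of the solving kernel read off from (9.26)."""
    a_squared = 1.0 - 2.0 * lam
    return 1.0 + (1.0 - a_squared) / (k * k + a_squared)

lam = 0.3
for k in (0.0, 0.7, 2.5):
    print(k, fourier_of_exp_abs(k), kernel_symbol(k, lam) * resolvent_symbol(k, lam))
```

The last column should be 1 for every real $k$ as long as $\lambda < 1/2$. For $\lambda > 1/2$ the quantity $k^2 + a^2$ vanishes at a real $k$, which is the failure described in the closing paragraph above.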
9.3.2 Laplace Transform Methods

The Volterra problem

$$ \int_0^x K(x-y)u(y)\,dy = f(x), \quad 0<x<\infty, \tag{9.28} $$

can also be solved by the application of an integral transform. In this case
we observe that the value of $K(x)$ is only needed for positive $x$, and this suggests
that we take a Laplace transform over the positive real axis.

Figure 9.2: We only require the value of $K(x)$ for $x$ positive, and $u$ and $f$
can be set to zero for $x < 0$.



Abel's equation

As an example of Laplace methods, consider Abel's equation

$$ f(x) = \int_0^x \frac{1}{\sqrt{x-y}}\,u(y)\,dy, \tag{9.29} $$

where we are given $f(x)$ and wish to find $u(x)$. Here it is clear that we need
$f(0) = 0$ for the equation to make sense. We have met this integral
transformation before in the definition of the "half-derivative". It is an example of
the more general equation of the form

$$ f(x) = \int_0^x K(x-y)u(y)\,dy. \tag{9.30} $$

Let us take the Laplace transform of both sides of (9.30):

$$ {\cal L}f(p) = \int_0^\infty e^{-px}\left(\int_0^x K(x-y)u(y)\,dy\right)dx = \int_0^\infty dx\int_0^x dy\; e^{-px}K(x-y)u(y). \tag{9.31} $$
Now we make the change of variables

$$ x = \xi + \eta, \qquad y = \eta. \tag{9.32} $$

Figure 9.3: Regions of integration for the convolution theorem: a) Integrating
over $y$ at fixed $x$, then over $x$; b) Integrating over $\eta$ at fixed $\xi$, then over $\xi$.

This has Jacobian

$$ \frac{\partial(x,y)}{\partial(\xi,\eta)} = 1, \tag{9.33} $$

and the integral becomes

$$ {\cal L}f(p) = \int_0^\infty\!\!\int_0^\infty e^{-p(\xi+\eta)}K(\xi)u(\eta)\,d\xi\,d\eta = \int_0^\infty e^{-p\xi}K(\xi)\,d\xi\int_0^\infty e^{-p\eta}u(\eta)\,d\eta = {\cal L}K(p)\,{\cal L}u(p). \tag{9.34} $$

Thus the Laplace transform of a Volterra convolution is the product of the
Laplace transforms. We can now invert

$$ u = {\cal L}^{-1}({\cal L}f/{\cal L}K). \tag{9.35} $$

For Abel's equation, we have

$$ K(x) = \frac{1}{\sqrt{x}}, \tag{9.36} $$
the Laplace transform of which is

$$ {\cal L}K(p) = \int_0^\infty x^{\frac12-1}e^{-px}\,dx = p^{-1/2}\,\Gamma\!\left(\tfrac12\right) = p^{-1/2}\sqrt{\pi}. \tag{9.37} $$

Therefore, the Laplace transform of the solution $u(x)$ is

$$ {\cal L}u(p) = \frac{1}{\sqrt{\pi}}\,p^{1/2}({\cal L}f) = \frac{1}{\pi}\left(\sqrt{\pi}\,p^{-1/2}\,p\,{\cal L}f\right). \tag{9.38} $$

Now $f(0) = 0$, and so

$$ p\,{\cal L}f = {\cal L}\left(\frac{d}{dx}f\right), \tag{9.39} $$

as may be seen by an integration by parts in the definition. Using this
observation, and depending on whether we put the $p$ next to $f$ or outside the
parenthesis, we conclude that the solution of Abel's equation can be written
in two equivalent ways:

$$ u(x) = \frac{1}{\pi}\frac{d}{dx}\int_0^x \frac{1}{\sqrt{x-y}}\,f(y)\,dy = \frac{1}{\pi}\int_0^x \frac{1}{\sqrt{x-y}}\,f'(y)\,dy. \tag{9.40} $$

Proving the equality of these two expressions was a problem we set ourselves
in chapter 6.
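
Abel's equation and its inversion can be checked on a simple pair. The following is a minimal Python sketch, assuming the test case $u(y) = y$ (our choice), for which $f(x) = \frac43 x^{3/2}$ and $f'(y) = 2\sqrt{y}$. The routine name abel_integral is hypothetical; it evaluates $\int_0^x g(y)/\sqrt{x-y}\,dy$ after the substitution $y = x\sin^2 t$, which removes the endpoint singularity.

```python
import math

def abel_integral(g, x, n=2000):
    """Midpoint estimate of int_0^x g(y) / sqrt(x - y) dy.

    The substitution y = x sin^2(t) turns the integral into
    int_0^{pi/2} g(x sin^2 t) * 2 sqrt(x) sin(t) dt, which has no endpoint
    singularity as long as g(y) grows no faster than 1/sqrt(y) at y = 0.
    """
    h = 0.5 * math.pi / n
    total = 0.0
    for j in range(n):
        s = math.sin((j + 0.5) * h)
        total += g(x * s * s) * 2.0 * math.sqrt(x) * s
    return h * total

# Hypothetical test case: u(y) = y gives f(x) = (4/3) x^(3/2), so f'(y) = 2 sqrt(y).
x = 2.0
f_at_x = abel_integral(lambda y: y, x)                              # Abel's equation itself
u_at_x = abel_integral(lambda y: 2.0 * math.sqrt(y), x) / math.pi   # the f' form of the solution
print(f_at_x, u_at_x)
```

The first number should agree with $\frac43 x^{3/2}$ and the second should return $u(x) = x$. Only the form of the solution containing $f'$ is used here, since it avoids numerical differentiation of an integral.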

Here is another way of establishing the equality: Assume for the moment
that $K(0)$ is finite, and that, as we have already noted, $f(0) = 0$. Then,

$$ \frac{d}{dx}\int_0^x K(x-y)f(y)\,dy \tag{9.41} $$

is equal to

$$ \begin{aligned}
& K(0)f(x) + \int_0^x \partial_x K(x-y)\,f(y)\,dy \\
&= K(0)f(x) - \int_0^x \partial_y K(x-y)\,f(y)\,dy \\
&= K(0)f(x) - \int_0^x \partial_y\bigl(K(x-y)f(y)\bigr)\,dy + \int_0^x K(x-y)f'(y)\,dy \\
&= K(0)f(x) - K(0)f(x) + K(x)f(0) + \int_0^x K(x-y)f'(y)\,dy \\
&= \int_0^x K(x-y)f'(y)\,dy.
\end{aligned} \tag{9.42} $$

Since $K(0)$ cancelled out, we need not worry that it is divergent! More
rigorously, we should regularize the improper integral by raising the lower
limit on the integral to a small positive quantity, and then taking the limit
to zero at the end of the calculation.

Radon Transforms

Figure 9.4: The geometry of the CAT scan Radon transformation, showing
the location of the point P with co-ordinates $x = p\cos\theta - t\sin\theta$, $y = p\sin\theta + t\cos\theta$.
(The figure labels the X-ray beam, the patient and the detector.)

An Abel integral equation lies at the heart of the method for reconstructing
the image in a computer aided tomography (CAT) scan. By rotating an
X-ray source about a patient and recording the direction-dependent shadow,
we measure the integral of his tissue density $f(x,y)$ along all lines in a slice
(which we will take to be the $x$, $y$ plane) through his body. The resulting
information is the Radon transform $F$ of the function $f$. If we parametrize
the family of lines by $p$ and $\theta$, as shown in figure 9.4, we have

$$ F(p,\theta) = \int_{-\infty}^{\infty} f(p\cos\theta - t\sin\theta,\; p\sin\theta + t\cos\theta)\,dt. \tag{9.43} $$
We will assume that $f$ is zero outside some finite region (the patient), and
so these integrals converge.

We wish to invert the transformation and recover $f$ from the data $F(p,\theta)$.
This problem was solved by Johann Radon in 1917. Radon made clever use
of the Euclidean group to simplify the problem. He observed that we may
take the point O at which we wish to find $f$ to be the origin, and defined$^1$

$$ F_O(p) = \frac{1}{2\pi}\int_0^{2\pi}\left\{\iint_{{\bf R}^2}\delta(x\cos\theta + y\sin\theta - p)\,f(x,y)\,dx\,dy\right\}d\theta. \tag{9.44} $$

Thus $F_O(p)$ is the angular average over all lines tangent to a circle of
radius $p$ about the desired inversion point. Radon then observed that if he
additionally defines

$$ \bar f(r) = \frac{1}{2\pi}\int_0^{2\pi} f(r\cos\phi, r\sin\phi)\,d\phi \tag{9.45} $$

then he can substitute $\bar f(r)$ for $f(x,y)$ in (9.44) without changing the value
of the integral. Furthermore $\bar f(0) = f(0,0)$. Hence, taking polar co-ordinates
in the $x$, $y$ plane, he has

$$ F_O(p) = \frac{1}{2\pi}\int_0^{2\pi}\left\{\iint_{{\bf R}^2}\delta(r\cos\phi\cos\theta + r\sin\phi\sin\theta - p)\,\bar f(r)\,r\,d\phi\,dr\right\}d\theta. \tag{9.46} $$

We can now use

$$ \delta(g(\phi)) = \sum_n \frac{1}{|g'(\phi_n)|}\,\delta(\phi - \phi_n), \tag{9.47} $$

where the sum is over the zeros $\phi_n$ of $g(\phi) = r\cos(\theta-\phi) - p$, to perform the
$\phi$ integral. Any given point $x = r\cos\phi$, $y = r\sin\phi$ lies on two distinct lines
if and only if $p < r$. Thus $g(\phi)$ has two zeros if $p < r$, but none if $r < p$.
Consequently

$$ F_O(p) = \frac{1}{2\pi}\int_0^{2\pi}\left\{\int_p^\infty \frac{2}{\sqrt{r^2-p^2}}\,\bar f(r)\,r\,dr\right\}d\theta. \tag{9.48} $$

Nothing in the inner integral depends on $\theta$. The outer integral is therefore
trivial, and so

$$ F_O(p) = \int_p^\infty \frac{2}{\sqrt{r^2-p^2}}\,\bar f(r)\,r\,dr. \tag{9.49} $$

1 We trust that the reader will forgive the anachronism of our expressing Radon's
formulae in terms of Dirac's delta function.
We can extract $F_O(p)$ from the data. We could therefore solve the Abel
equation (9.49) and recover the complete function $\bar f(r)$. We are only
interested in $\bar f(0)$, however, and it is easier to verify a claimed solution. Radon asserts
that

$$ f(0,0) = \bar f(0) = -\frac{1}{\pi}\int_0^\infty \frac{1}{p}\left(\frac{\partial}{\partial p}F_O(p)\right)dp. \tag{9.50} $$

To prove that his claim is true we must first take the derivative of $F_O(p)$ and
show that

$$ \frac{\partial}{\partial p}F_O(p) = \int_p^\infty \frac{2p}{\sqrt{r^2-p^2}}\left(\frac{\partial}{\partial r}\bar f(r)\right)dr. \tag{9.51} $$

The details of this computation are left as an exercise. It is little different
from the differentiation of the integral transform at the end of the last section.
We then substitute (9.51) into (9.50) and evaluate the resulting integral

$$ I = -\frac{1}{\pi}\int_0^\infty \frac{1}{p}\left\{\int_p^\infty \frac{2p}{\sqrt{r^2-p^2}}\left(\frac{\partial}{\partial r}\bar f(r)\right)dr\right\}dp, \tag{9.52} $$

by exchanging the order of the integrations, as shown in figure 9.5.

Figure 9.5: a) In (9.52) we integrate first over $r$ and then over $p$. The inner $r$
integral is therefore from $r = p$ to $r = \infty$. b) In (9.53) we integrate first over
$p$ and then over $r$. The inner $p$ integral therefore runs from $p = 0$ to $p = r$.

After the interchange we have

$$ I = -\frac{2}{\pi}\int_0^\infty\left\{\int_0^r \frac{1}{\sqrt{r^2-p^2}}\,dp\right\}\left(\frac{\partial}{\partial r}\bar f(r)\right)dr. \tag{9.53} $$

Since

$$ \int_0^r \frac{1}{\sqrt{r^2-p^2}}\,dp = \frac{\pi}{2}, \tag{9.54} $$

the inner integral is independent of $r$. We thus obtain

$$ I = -\int_0^\infty\left(\frac{\partial}{\partial r}\bar f(r)\right)dr = \bar f(0) = f(0,0). \tag{9.55} $$

Radon's inversion formula is therefore correct.
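
Radon's formula can also be tested numerically on a density whose line integrals are known. The following is a minimal Python sketch, assuming the rotationally symmetric test density $\bar f(r) = e^{-r^2}$ (our choice), for which the angular average of the line integrals is $\sqrt{\pi}\,e^{-p^2}$ and $f(0,0) = 1$. The function names, cutoffs and step sizes are hypothetical choices, not part of the text.

```python
import math

def radon_average(f_bar, p, s_max=8.0, n=400):
    """Angular average F(p) = 2 int_p^inf f_bar(r) r dr / sqrt(r^2 - p^2), as in (9.49).

    Substituting r = sqrt(p^2 + s^2) removes the square-root singularity and
    leaves 2 int_0^inf f_bar(sqrt(p^2 + s^2)) ds, evaluated by the midpoint rule.
    """
    h = s_max / n
    return 2.0 * h * sum(f_bar(math.hypot(p, (j + 0.5) * h)) for j in range(n))

def radon_invert_at_origin(big_f, p_max=6.0, n=300, dp=1e-4):
    """f(0) = -(1/pi) int_0^inf (1/p) dF/dp dp, as in (9.50).

    The derivative is a central difference and the p integral a midpoint sum.
    """
    h = p_max / n
    total = 0.0
    for j in range(n):
        p = (j + 0.5) * h
        slope = (big_f(p + dp) - big_f(p - dp)) / (2.0 * dp)
        total += slope / p
    return -h * total / math.pi

gaussian = lambda r: math.exp(-r * r)        # f_bar(r), so f(0, 0) = 1
print(radon_invert_at_origin(lambda p: radon_average(gaussian, p)))
```

With exact, noise-free data the result is very close to 1. The factor $1/p$ multiplying a numerical derivative hints at why reconstruction from noisy measurements is delicate.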

Although Radon found a closed-form inversion formula, the numerical 
problem of reconstructing the image from the partial and noisy data obtained 
from a practical CAT scanner is quite delicate, and remains an active area 
of research. 



9.4 Separable Kernels 

Let

$$ K(x,y) = \sum_{i=1}^{N} p_i(x)\,q_i(y), \tag{9.56} $$

where $\{p_i\}$ and $\{q_i\}$ are two linearly independent sets of functions. The
range of $K$ is therefore the span $\langle p_i\rangle$ of the set $\{p_i\}$. Such kernels are said
to be separable. The theory of integral equations containing such kernels is
especially transparent.



9.4.1 Eigenvalue problem 

Consider the eigenvalue problem

$$ \lambda u(x) = \int_D K(x,y)u(y)\,dy \tag{9.57} $$

for a separable kernel. Here, $D$ is some range of integration, and $x \in D$. If
$\lambda \ne 0$, we know that $u$ has to be in the range of $K$, so we can write

$$ u(x) = \sum_i \xi_i\,p_i(x). \tag{9.58} $$

Inserting this into the integral, we find that our problem reduces to the finite
matrix eigenvalue equation

$$ \lambda\xi_i = A_{ij}\xi_j, \tag{9.59} $$

where

$$ A_{ij} = \int_D q_i(y)\,p_j(y)\,dy. \tag{9.60} $$

Matters are especially simple when $q_i = p_i^*$. In this case $A_{ij} = A_{ji}^*$, so the
matrix $A$ is Hermitian and has $N$ linearly independent eigenvectors. Further,
none of the $N$ associated eigenvalues can be zero. To see that this is so
suppose that $v(x) = \sum_i \zeta_i p_i(x)$ is an eigenvector with zero eigenvalue. In
other words, suppose that

$$ 0 = \sum_i p_i(x)\int_D p_i^*(y)\,p_j(y)\,\zeta_j\,dy. \tag{9.61} $$

Since the $p_i(x)$ are linearly independent, we must have

$$ 0 = \int_D p_i^*(y)\,p_j(y)\,\zeta_j\,dy, \tag{9.62} $$

for each $i$ separately. Multiplying by $\zeta_i^*$ and summing we find

$$ 0 = \int_D \Bigl|\sum_j p_j(y)\,\zeta_j\Bigr|^2 dy = \int_D |v(y)|^2\,dy, \tag{9.63} $$

and so $v(x)$ itself must have been zero. The remaining (infinite in number)
eigenfunctions span $\langle q_i\rangle^\perp$ and have $\lambda = 0$.

9.4.2 Inhomogeneous problem

It is easiest to discuss inhomogeneous separable-kernel problems by example.
Consider the equation

$$ u(x) = f(x) + \mu\int_0^1 K(x,y)u(y)\,dy, \tag{9.64} $$

where $K(x,y) = xy$. Here, $f(x)$ and $\mu$ are given, and $u(x)$ is to be found.
We know that $u(x)$ must be of the form

$$ u(x) = f(x) + ax, \tag{9.65} $$

and the only task is to find the constant $a$. We plug $u$ into the integral
equation and, after cancelling a common factor of $x$, we find

$$ a = \mu\int_0^1 y\,u(y)\,dy = \mu\int_0^1 y f(y)\,dy + a\mu\int_0^1 y^2\,dy. \tag{9.66} $$

The last term is equal to $\mu a/3$, so

$$ a = \frac{\mu\int_0^1 y f(y)\,dy}{1-\mu/3}, \tag{9.67} $$

and finally

$$ u(x) = f(x) + x\,\frac{\mu\int_0^1 y f(y)\,dy}{(1-\mu/3)}. \tag{9.68} $$

Notice that this solution is meaningless if $\mu = 3$. We can relate this to the
eigenvalues of the kernel $K(x,y) = xy$. The eigenvalue problem for this
kernel is

$$ \lambda u(x) = \int_0^1 xy\,u(y)\,dy. \tag{9.69} $$

On substituting $u(x) = ax$, this reduces to $\lambda ax = ax/3$, and so $\lambda = 1/3$. All
other eigenvalues are zero. Our inhomogeneous equation was of the form

$$ (1-\mu K)u = f \tag{9.70} $$

and the operator $(1-\mu K)$ has an infinite set of eigenfunctions with eigenvalue
1, and a single eigenfunction, $u_0(x) = x$, with eigenvalue $(1-\mu/3)$. The
eigenvalue becomes zero, and hence the inverse ceases to exist, when $\mu = 3$.

A solution to the problem $(1-\mu K)u = f$ may still exist even when $\mu = 3$.
But now, applying the Fredholm alternative, we see that $f$ must satisfy the
condition that it be orthogonal to all solutions of $(1-\mu K)^\dagger v = 0$. Since our
kernel is Hermitian, this means that $f$ must be orthogonal to the zero mode
$u_0(x) = x$. For the case of $\mu = 3$, the equation is

$$ u(x) = f(x) + 3\int_0^1 xy\,u(y)\,dy, \tag{9.71} $$

and to have a solution $f$ must obey $\int_0^1 y f(y)\,dy = 0$. We again set
$u = f(x) + ax$, and find

$$ a = 3\int_0^1 y f(y)\,dy + a\,3\int_0^1 y^2\,dy, \tag{9.72} $$

but now this reduces to $a = a$. The general solution is therefore

$$ u = f(x) + ax \tag{9.73} $$

with $a$ arbitrary.
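
The closed-form solution for the kernel $K(x,y) = xy$ is easily verified by substituting it back into the integral equation. The following is a minimal Python sketch, assuming $f(x) = \cos x$ and $\mu = 1.5$ (both our choices) and using a composite Simpson rule for the one integral that is needed; the function names are hypothetical.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3.0

def solve_xy_kernel(f, mu):
    """Return u(x) = f(x) + a x with a = mu * int_0^1 y f(y) dy / (1 - mu/3); needs mu != 3."""
    a = mu * simpson(lambda y: y * f(y), 0.0, 1.0) / (1.0 - mu / 3.0)
    return lambda x: f(x) + a * x

f, mu = math.cos, 1.5
u = solve_xy_kernel(f, mu)
moment = simpson(lambda y: y * u(y), 0.0, 1.0)
residual = max(abs(u(x) - f(x) - mu * x * moment) for x in (0.0, 0.25, 0.5, 1.0))
print(residual)
```

The residual vanishes to rounding error for any $\mu \ne 3$. At $\mu = 3$ the routine divides by zero, mirroring the failure of the inverse of $(1-\mu K)$ discussed above.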
9.5 Singular Integral Equations

Equations involving principal-part integrals, such as the airfoil equation

$$ \frac{P}{\pi}\int_{-1}^{1}\varphi(x)\,\frac{1}{x-y}\,dx = f(y), \tag{9.74} $$

in which $f$ is given and we are to find $\varphi$, are called singular integral equations.
Their solution depends on what conditions are imposed on the unknown
function $\varphi(x)$ at the endpoints of the integration region. We will consider
only this simplest example here.$^2$

9.5.1 Solution via Tchebychef Polynomials

Recall the definition of the Tchebychef polynomials from chapter 2. We set

$$ T_n(x) = \cos(n\cos^{-1}x), \tag{9.75} $$

$$ U_{n-1}(x) = \frac{\sin(n\cos^{-1}x)}{\sin(\cos^{-1}x)} = \frac{1}{n}T_n'(x). \tag{9.76} $$

These are the Tchebychef Polynomials of the first and second kind,
respectively. The orthogonality of the functions $\cos n\theta$ and $\sin n\theta$ over the interval
$[0,\pi]$ translates into

$$ \int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,T_n(x)\,T_m(x)\,dx = h_n\,\delta_{nm}, \quad n,m \ge 0, \tag{9.77} $$

where $h_0 = \pi$, $h_n = \pi/2$, $n > 0$, and

$$ \int_{-1}^{1}\sqrt{1-x^2}\;U_{n-1}(x)\,U_{m-1}(x)\,dx = \frac{\pi}{2}\,\delta_{nm}, \quad n,m > 0. \tag{9.78} $$

The sets $\{T_n(x)\}$ and $\{U_n(x)\}$ are complete in $L^2_w[-1,1]$ with the weight
functions $w = (1-x^2)^{-1/2}$ and $w = (1-x^2)^{1/2}$, respectively.

Rather less obvious are the principal-part integral identities (valid for
$-1 < y < 1$)

$$ P\int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,\frac{1}{x-y}\,dx = 0, \tag{9.79} $$

$$ P\int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,T_n(x)\,\frac{1}{x-y}\,dx = \pi\,U_{n-1}(y), \quad n > 0, \tag{9.80} $$

2 The classic text is N. I. Muskhelishvili, Singular Integral Equations.
and

$$ P\int_{-1}^{1}\sqrt{1-x^2}\;U_{n-1}(x)\,\frac{1}{x-y}\,dx = -\pi\,T_n(y), \quad n > 0. \tag{9.81} $$

These correspond, after we set $x = \cos\theta$ and $y = \cos\phi$, to the trigonometric
integrals

$$ P\int_0^\pi \frac{\cos n\theta}{\cos\theta - \cos\phi}\,d\theta = \pi\,\frac{\sin n\phi}{\sin\phi}, \tag{9.82} $$

and

$$ P\int_0^\pi \frac{\sin\theta\,\sin n\theta}{\cos\theta - \cos\phi}\,d\theta = -\pi\cos n\phi, \tag{9.83} $$

respectively. We will motivate and derive these formulas at the end of this
section.

Granted the validity of these principal-part integrals we can solve the
integral equation

$$ \frac{P}{\pi}\int_{-1}^{1}\varphi(x)\,\frac{1}{x-y}\,dx = f(y), \quad y \in [-1,1], \tag{9.84} $$

for $\varphi$ in terms of $f$, subject to the condition that $\varphi$ be bounded at $x = \pm1$.
We show that no solution exists unless $f$ satisfies the condition

$$ \int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,f(x)\,dx = 0, \tag{9.85} $$

but if $f$ does satisfy this condition then there is a unique solution

$$ \varphi(y) = -\frac{\sqrt{1-y^2}}{\pi}\,P\int_{-1}^{1}\frac{1}{\sqrt{1-x^2}}\,f(x)\,\frac{1}{x-y}\,dx. \tag{9.86} $$

To understand why this is the solution, and why there is a condition on $f$,
expand

$$ f(x) = \sum_{n=1}^{\infty} b_n T_n(x). \tag{9.87} $$

Here, the condition on $f$ translates into the absence of a term involving
$T_0 \equiv 1$ in the expansion. Then,

$$ \varphi(x) = -\sqrt{1-x^2}\,\sum_{n=1}^{\infty} b_n U_{n-1}(x), \tag{9.88} $$

with $b_n$ the coefficients that appear in the expansion of $f$, solves the problem.
That this is so may be seen on substituting this expansion for $\varphi$ into the
integral equation and using the second of the principal-part identities. This
identity provides no way to generate a term with $T_0$; hence the constraint.
Next we observe that the expansion for $\varphi$ is generated term-by-term from
the expansion for $f$ by substituting this into the integral form of the solution
and using the first principal-part identity.
Similarly, we solve for $\varphi(y)$ in

$$ \frac{P}{\pi}\int_{-1}^{1}\varphi(x)\,\frac{1}{x-y}\,dx = f(y), \quad y \in [-1,1], \tag{9.89} $$

where now $\varphi$ is permitted to be singular at $x = \pm1$. In this case there is
always a solution, but it is not unique. The solutions are

$$ \varphi(y) = -\frac{1}{\pi\sqrt{1-y^2}}\,P\int_{-1}^{1}\sqrt{1-x^2}\,f(x)\,\frac{1}{x-y}\,dx + \frac{C}{\sqrt{1-y^2}}, \tag{9.90} $$

where $C$ is an arbitrary constant. To see this, expand

$$ f(x) = \sum_{n=1}^{\infty} a_n U_{n-1}(x), \tag{9.91} $$

and then

$$ \varphi(x) = \frac{1}{\sqrt{1-x^2}}\left(\sum_{n=1}^{\infty} a_n T_n(x) + C\,T_0\right), \tag{9.92} $$

satisfies the equation for any value of the constant $C$. Again the expansion
for $\varphi$ is generated from that of $f$ by use of the second principal-part identity.



Explanation of the Principal-Part Identities

The principal-part identities can be extracted from the analytic properties of
the resolvent operator $R_\lambda(n-n') \equiv (H-\lambda I)^{-1}_{n,n'}$ for a tight-binding model
of the conduction band in a one-dimensional crystal with nearest neighbour
hopping. The eigenfunctions $u_E(n)$ for this problem obey

$$ u_E(n+1) + u_E(n-1) = E\,u_E(n) \tag{9.93} $$

and are

$$ u_E(n) = e^{in\theta}, \quad -\pi < \theta < \pi, \tag{9.94} $$

with energy eigenvalues $E = 2\cos\theta$.
The resolvent $R_\lambda(n)$ obeys

$$ R_\lambda(n+1) + R_\lambda(n-1) - \lambda R_\lambda(n) = \delta_{n0}, \quad n \in {\bf Z}, \tag{9.95} $$

and can be expanded in terms of the energy eigenfunctions as

$$ R_\lambda(n-n') = \sum_E \frac{u_E(n)\,u_E^*(n')}{E-\lambda} = \int_{-\pi}^{\pi}\frac{e^{i(n-n')\theta}}{2\cos\theta - \lambda}\,\frac{d\theta}{2\pi}. \tag{9.96} $$

If we set $\lambda = 2\cos\phi$, we observe that

$$ \int_{-\pi}^{\pi}\frac{e^{in\theta}}{2\cos\theta - 2\cos\phi}\,\frac{d\theta}{2\pi} = \frac{1}{2i\sin\phi}\,e^{i|n|\phi}, \quad {\rm Im}\,\phi > 0. \tag{9.97} $$

That this integral is correct can be confirmed by observing that it is
evaluating the Fourier coefficient of the double geometric series

$$ \sum_{n=-\infty}^{\infty} e^{-in\theta}\,e^{i|n|\phi} = \frac{2i\sin\phi}{2\cos\theta - 2\cos\phi}, \quad {\rm Im}\,\phi > 0. \tag{9.98} $$

By writing $e^{in\theta} = \cos n\theta + i\sin n\theta$ and observing that the sine term integrates
to zero, we find that

$$ \int_0^\pi \frac{\cos n\theta}{\cos\theta - \cos\phi}\,d\theta = \frac{\pi}{i\sin\phi}\,(\cos n\phi + i\sin n\phi), \tag{9.99} $$

where $n > 0$, and again we have taken ${\rm Im}\,\phi > 0$. Now let $\phi$ approach the
real axis from above, and apply the Plemelj formula. We find

$$ P\int_0^\pi \frac{\cos n\theta}{\cos\theta - \cos\phi}\,d\theta = \pi\,\frac{\sin n\phi}{\sin\phi}. \tag{9.100} $$

This is the first principal-part integral identity. The second identity,

$$ P\int_0^\pi \frac{\sin\theta\,\sin n\theta}{\cos\theta - \cos\phi}\,d\theta = -\pi\cos n\phi, \tag{9.101} $$

is obtained from the first by using the addition theorems for the sine and
cosine.
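
The first principal-part identity can be checked numerically without any special treatment of the singularity. The following is a minimal Python sketch of one way to do it (our own device, not the derivation used in the text): since the $n = 0$ principal-part integral vanishes, $\cos n\phi$ may be subtracted from the numerator, and the quotient that remains is a polynomial in $\cos\theta$ with no singularity at $\theta = \phi$. The function name is hypothetical, and the sketch assumes that $\phi$ does not coincide with one of the midpoint nodes.

```python
import math

def principal_part_cosine(n, phi, nodes=1000):
    """Estimate P int_0^pi cos(n t) / (cos t - cos phi) dt for 0 < phi < pi.

    Because the n = 0 principal-part integral vanishes, cos(n phi) may be
    subtracted from the numerator for free.  The quotient that remains is a
    polynomial in cos t, so an ordinary midpoint sum handles it.
    Assumes phi is not exactly one of the midpoint nodes.
    """
    h = math.pi / nodes
    total = 0.0
    for j in range(nodes):
        t = (j + 0.5) * h
        total += (math.cos(n * t) - math.cos(n * phi)) / (math.cos(t) - math.cos(phi))
    return h * total

phi = 1.0
for n in (1, 2, 5):
    print(n, principal_part_cosine(n, phi), math.pi * math.sin(n * phi) / math.sin(phi))
```

The two printed columns should agree to many digits, because the midpoint rule integrates low-degree trigonometric polynomials over a full period essentially exactly.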
9.6 Wiener-Hopf equations I

We have seen that Volterra equations of the form

$$ \int_0^x K(x-y)u(y)\,dy = f(x), \quad 0 < x < \infty, \tag{9.102} $$

having translation invariant kernels, may be solved for $u$ by using a Laplace
transform. The apparently innocent modification

$$ \int_0^\infty K(x-y)u(y)\,dy = f(x), \quad 0 < x < \infty \tag{9.103} $$

leads to an equation that is much harder to deal with. In these Wiener-Hopf
equations, we are still only interested in the upper left quadrant of the
continuous matrix $K(x-y)$

Figure 9.6: The matrix form of (9.103).

and $K(x-y)$ still has entries depending only on their distance from the
main diagonal. Now, however, we make use of the values of $K(x)$ for all of
$-\infty < x < \infty$. This suggests the use of a Fourier transform. The problem is
that, in order to Fourier transform, we must integrate over the entire real line
on both sides of the equation and this requires us to know the values of
$f(x)$ for negative values of $x$, but we have not been given this information
(and do not really need it). We therefore make the replacement

$$ f(x) \to f(x) + g(x), \tag{9.104} $$

where $f(x)$ is non-zero only for positive $x$, and $g(x)$ non-zero only for negative
$x$. We then solve

$$ \int_0^\infty K(x-y)u(y)\,dy = \begin{cases} f(x), & 0 < x < \infty,\\ g(x), & -\infty < x < 0, \end{cases} \tag{9.105} $$
so as to find $u$ and $g$ simultaneously. In other words, we extend the problem
to one on the whole real line, but with the negative-$x$ source term $g(x)$ chosen
so that the solution $u(x)$ is non-zero only for positive $x$. We represent this
pictorially in figure 9.7.

Figure 9.7: The matrix form of (9.105) with both $f$ and $g$

To find $u$ and $g$ we try to make an "LU" decomposition of the matrix $K$ into
the product $K = L^{-1}U$ of an upper triangular matrix $U(x-y)$ and a lower
triangular matrix $L^{-1}(x-y)$. Written out in full, the product $L^{-1}U$ is

$$ K(x-y) = \int_{-\infty}^{\infty} L^{-1}(x-t)\,U(t-y)\,dt. \tag{9.106} $$

Now the inverse of a lower triangular matrix is also lower triangular, and so
$L(x-y)$ itself is lower triangular. This means that the function $U(x)$ is zero
for negative $x$, whilst $L(x)$ is zero when $x$ is positive.

Figure 9.8: The matrix decomposition $K = L^{-1}U$.

If we can find such a decomposition, then on multiplying both sides by $L$,
equation (9.103) becomes

$$ \int_0^x U(x-y)u(y)\,dy = h(x), \quad 0 < x < \infty, \tag{9.107} $$
where

$$ h(x) = \int_x^\infty L(x-y)f(y)\,dy, \quad 0 < x < \infty. \tag{9.108} $$

These two equations come from the upper half of the full matrix equation
represented in figure 9.9.

Figure 9.9: The equation (9.107) and the definition (9.108) correspond to
the upper half of these two matrix equations.

The lower parts of the matrix equation have no influence on (9.107) and
(9.108): The function $h(x)$ depends only on $f$, and while $g(x)$ should be
chosen to give the column of zeros below $h$, we do not, in principle, need to
know it. This is because we could solve the Volterra equation $Uu = h$ (9.107)
via a Laplace transform. In practice (as we will see) it is easier to find $g(x)$,
and then, knowing the $(f,g)$ column vector, obtain $u(x)$ by solving (9.105).
This we can do by Fourier transform.

The difficulty lies in finding the LU decomposition. For finite matrices
this decomposition is a standard technique in numerical linear algebra. It
is equivalent to the method of Gaussian elimination, which, although we were
probably never told its name, is the strategy taught in high school for solving
simultaneous equations. For continuously infinite matrices, however, making
such a decomposition demands techniques far beyond those learned in school.
It is a particular case of the scalar Riemann-Hilbert problem, and its solution
requires the use of complex variable methods.

On taking the Fourier transform of (9.106) we see that we are being asked
to factorize

$$ \tilde K(k) = [\tilde L(k)]^{-1}\,\tilde U(k) \tag{9.109} $$

where

$$ \tilde U(k) = \int_0^\infty e^{ikx}\,U(x)\,dx \tag{9.110} $$
is analytic (i.e. has no poles or other singularities) in the region ${\rm Im}\,k > 0$,
and similarly

$$ \tilde L(k) = \int_{-\infty}^{0} e^{ikx}\,L(x)\,dx \tag{9.111} $$

has no poles for ${\rm Im}\,k < 0$, these analyticity conditions being consequences
of the vanishing conditions $U(x-y) = 0$, $x < y$ and $L(x-y) = 0$, $x > y$.
There will be more than one way of factoring $\tilde K$ into functions with these
no-pole properties, but, because the inverse of an upper or lower triangular
matrix is also upper or lower triangular, the matrices $U^{-1}(x-y)$ and
$L^{-1}(x-y)$ have the same vanishing properties, and, because these inverse
matrices correspond to the reciprocals of the Fourier transform, we must
also demand that $\tilde U(k)$ and $\tilde L(k)$ have no zeros in the upper and lower half
plane respectively. The combined no-poles, no-zeros conditions will usually
determine the factors up to constants. If we are able to factorize $\tilde K(k)$ in
this manner, we have effected the LU decomposition. When $\tilde K(k)$ is a
rational function of $k$ we can factorize by inspection. In the general case, more
sophistication is required.
Example: Let us solve the equation

$$ u(x) - \lambda\int_0^\infty e^{-|x-y|}\,u(y)\,dy = f(x), \tag{9.112} $$

where we will assume that $\lambda < 1/2$. Here the kernel function is

$$ K(x,y) = \delta(x-y) - \lambda e^{-|x-y|}. \tag{9.113} $$

This has Fourier transform

$$ \tilde K(k) = 1 - \frac{2\lambda}{k^2+1} = \frac{k^2+(1-2\lambda)}{k^2+1} = \left(\frac{k+ia}{k+i}\right)\left(\frac{k-i}{k-ia}\right)^{-1}, \tag{9.114} $$

where $a^2 = 1-2\lambda$. We were able to factorize this by inspection with

$$ \tilde U(k) = \frac{k+ia}{k+i}, \qquad \tilde L(k) = \frac{k-i}{k-ia}, \tag{9.115} $$

having poles and zeros only in the lower (respectively upper) half-plane. We
could now transform back into $x$ space to find $U(x-y)$, $L(x-y)$ and solve
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the Volterra equation Uu = h. It is, however, less effort to work directly 
with the Fourier transformed equation in the form 

Here we have placed subscripts on $\tilde f(k)$, $\tilde g(k)$ and $\tilde u(k)$ to remind us that these
Fourier transforms are analytic in the upper (+) or lower (-) half-plane. Since
the left-hand side of this equation is analytic in the upper half-plane, so must
be the right-hand side. We therefore choose $\tilde g_-(k)$ to eliminate the potential
pole at $k = ia$ that might arise from the first term on the right. This we can
do by setting

$$\left(\frac{k-i}{k-ia}\right)\tilde g_-(k) = \frac{\alpha}{k-ia} \qquad (9.117)$$

for some as yet undetermined constant $\alpha$. (Observe that the resultant $\tilde g_-(k)$
is indeed analytic in the lower half-plane. This analyticity ensures that $g(x)$
is zero for positive $x$.) We can now solve for $\tilde u(k)$ as

$$\begin{aligned}
\tilde u(k) &= \left(\frac{k+i}{k+ia}\right)\left(\frac{k-i}{k-ia}\right)\tilde f_+(k) + \left(\frac{k+i}{k+ia}\right)\frac{\alpha}{k-ia} \\
&= \frac{k^2+1}{k^2+a^2}\,\tilde f_+(k) + \alpha\,\frac{k+i}{k^2+a^2} \\
&= \tilde f_+(k) + \frac{1-a^2}{k^2+a^2}\,\tilde f_+(k) + \alpha\,\frac{k+i}{k^2+a^2}. \qquad (9.118)
\end{aligned}$$



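The inverse Fourier transform of

$$\frac{k+i}{k^2+a^2} \qquad (9.119)$$

is

$$\frac{i}{2|a|}\bigl(1 - |a|\,\mathrm{sgn}(x)\bigr)e^{-|a||x|}, \qquad (9.120)$$

and that of

$$\frac{1-a^2}{k^2+a^2} = \frac{2\lambda}{k^2 + (1-2\lambda)} \qquad (9.121)$$

is

$$\frac{\lambda}{\sqrt{1-2\lambda}}\, e^{-\sqrt{1-2\lambda}\,|x|}. \qquad (9.122)$$

Consequently

$$u(x) = f(x) + \frac{\lambda}{\sqrt{1-2\lambda}} \int_0^\infty e^{-\sqrt{1-2\lambda}\,|x-y|} f(y)\,dy + \beta\bigl(1 - \sqrt{1-2\lambda}\,\mathrm{sgn}\,x\bigr)e^{-\sqrt{1-2\lambda}\,|x|}. \qquad (9.123)$$

Here $\beta$ is some multiple of $\alpha$, and we have used the fact that $f(y)$ is zero
for negative $y$ to make the lower limit on the integral 0 instead of $-\infty$. We
determine the as yet unknown $\beta$ from the requirement that $u(x) = 0$ for
$x < 0$. We find that this will be the case if we take

$$\beta = -\frac{\lambda}{a(a+1)} \int_0^\infty e^{-ay} f(y)\,dy. \qquad (9.124)$$

The solution is therefore, for $x > 0$,

$$u(x) = f(x) + \frac{\lambda}{\sqrt{1-2\lambda}} \int_0^\infty e^{-\sqrt{1-2\lambda}\,|x-y|} f(y)\,dy + \frac{\lambda\bigl(\sqrt{1-2\lambda} - 1\bigr)}{1 - 2\lambda + \sqrt{1-2\lambda}}\, e^{-\sqrt{1-2\lambda}\,x} \int_0^\infty e^{-\sqrt{1-2\lambda}\,y} f(y)\,dy. \qquad (9.125)$$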
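As a sanity check on the closed-form solution (9.125), the following sketch evaluates it numerically and confirms that the result satisfies the original equation (9.112). The choices $f(y) = e^{-y}$ and $\lambda = 0.3$ are illustrative assumptions, as is the truncation of the half-line at $y = 60$; for this particular $f$ a short hand calculation (not in the text) reduces (9.125) to $u(x) = 2e^{-ax}/(1+a)$ with $a = \sqrt{1-2\lambda}$, which the code uses as a cross-check.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3.0

def half_line_integral(g, kink, cutoff=60.0):
    """Integrate g over [0, cutoff], splitting at y = kink where |x - y| has a corner.
    The cutoff is an assumption: the integrands used here decay exponentially."""
    return simpson(g, 0.0, kink) + simpson(g, kink, cutoff)

def u_formula(x, f, lam):
    """Right-hand side of (9.125) for x > 0, with a = sqrt(1 - 2*lam)."""
    a = math.sqrt(1.0 - 2.0 * lam)
    conv = half_line_integral(lambda y: math.exp(-a * abs(x - y)) * f(y), x)
    tail = half_line_integral(lambda y: math.exp(-a * y) * f(y), x)
    return (f(x) + (lam / a) * conv
            + lam * (a - 1.0) / (a * a + a) * math.exp(-a * x) * tail)

def residual(x, u, f, lam):
    """u(x) - lam * int_0^inf e^{-|x-y|} u(y) dy - f(x), cf. (9.112)."""
    integral = half_line_integral(lambda y: math.exp(-abs(x - y)) * u(y), x)
    return u(x) - lam * integral - f(x)

lam = 0.3                                   # illustrative value, lam < 1/2
a = math.sqrt(1.0 - 2.0 * lam)
f = lambda y: math.exp(-y)                  # illustrative right-hand side
u_closed = lambda y: 2.0 * math.exp(-a * y) / (1.0 + a)   # hand-derived for this f

for x in (0.5, 1.0, 3.0):
    print(x, u_formula(x, f, lam), u_closed(x), residual(x, u_closed, f, lam))
```

The two columns of $u$ values should agree and the residual should be at the level of the quadrature error.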

Not every invertible n-by-n matrix has a plain LU decomposition. For a 
related reason not every Wiener-Hopf equation can be solved so simply. In- 
stead there is a topological index theorem that determines whether solutions 
can exist, and, if solutions do exist, whether they are unique. We shall therefore
return to this problem once we have acquired a deeper understanding of
the interaction between topology and complex analysis.



9.7 Some Functional Analysis 

We have hitherto avoided, as far as it is possible, the full rigours of mathe- 
matics. For most of us, and for most of the time, we can solve our physics 
problems by using calculus rather than analysis. It is worth, nonetheless, be- 
ing familiar with the proper mathematical language so that when something 
tricky comes up we know where to look for help. The modern setting for 
the mathematical study of integral and differential equations is the discipline 
of functional analysis, and the classic text for the mathematically inclined 
physicist is the four-volume set Methods of Modern Mathematical Physics by
Michael Reed and Barry Simon. We cannot summarize these volumes in a few
paragraphs, but we can try to provide enough background for us to be able
to explain a few issues that may have puzzled the alert reader.

This section requires the reader to have sufficient background in real 
analysis to know what it means for a set to be compact. 

9.7.1 Bounded and Compact Operators 

i) A linear operator $K : L^2 \to L^2$ is bounded if there is a positive number
$M$ such that

$$\|Kx\| \le M\|x\|, \quad \forall x \in L^2. \qquad (9.126)$$

If $K$ is bounded then the smallest such $M$ is the norm of $K$, which we
denote by $\|K\|$. Thus

$$\|Kx\| \le \|K\|\,\|x\|. \qquad (9.127)$$

For a finite-dimensional matrix, $\|K\|$ is the square root of the largest
eigenvalue of $K^\dagger K$; for a hermitian matrix this is the largest of the
$|\lambda_n|$. The function $Kx$ is a continuous function of $x$ if, and only if, $K$ is
bounded. "Bounded" and "continuous" are therefore synonyms. Linear
differential operators are never bounded, and this is the source of most
of the complications in their theory.

ii) If the operators $A$ and $B$ are bounded, then so is $AB$ and

$$\|AB\| \le \|A\|\,\|B\|. \qquad (9.128)$$

iii) A linear operator $K : L^2 \to L^2$ is compact (or completely continuous)
if it maps bounded sets in $L^2$ to relatively compact sets (sets whose
closure is compact). Equivalently, $K$ is compact if the image sequence
$Kx_n$ of every bounded sequence of functions $x_n$ contains a convergent
subsequence. Compact $\Rightarrow$ continuous, but not vice versa. One can
show that, given any positive number $M$, a compact self-adjoint operator
has only a finite number of eigenvalues with $\lambda$ outside the interval
$[-M, M]$. The eigenvectors $u_n$ with non-zero eigenvalues span the range
of the operator. Any vector can therefore be written

$$u = u_0 + \sum_i a_i u_i, \qquad (9.129)$$

where $u_0$ lies in the null space of $K$. The Green function of a linear
differential operator defined on a finite interval is usually the integral
kernel of a compact operator.
iv) If $K$ is compact then

$$H = I + K \qquad (9.130)$$

is Fredholm. This means that $H$ has a finite dimensional kernel and
co-kernel, and that the Fredholm alternative applies.

v) An integral kernel is Hilbert-Schmidt if

$$\int |K(\xi,\eta)|^2 \, d\xi\, d\eta < \infty. \qquad (9.131)$$

This means that $K$ can be expanded in terms of a complete orthonormal
set $\{\phi_m\}$ as

$$K(x,y) = \sum_{n,m=1}^{\infty} A_{nm}\,\phi_n(x)\,\phi_m^*(y) \qquad (9.132)$$

in the sense that

$$\lim_{N,M\to\infty} \Bigl\| \sum_{n,m=1}^{N,M} A_{nm}\,\phi_n\phi_m^* - K \Bigr\| = 0. \qquad (9.133)$$

Now the finite sum

$$\sum_{n,m=1}^{N,M} A_{nm}\,\phi_n(x)\,\phi_m^*(y) \qquad (9.134)$$

is automatically compact since it is bounded and has finite-dimensional
range. (The unit ball in a Hilbert space is relatively compact if and only if the
space is finite dimensional.) Thus, Hilbert-Schmidt implies that $K$ is
approximated in norm by compact operators. But it is not hard to
show that a norm-convergent limit of compact operators is compact,
so $K$ itself is compact. Thus

$$\text{Hilbert-Schmidt} \;\Rightarrow\; \text{compact}.$$

It is easy to test a given kernel to see if it is Hilbert-Schmidt (simply
use the definition) and therein lies the utility of the concept.
If we have a Hilbert-Schmidt Green function $g$, we can recast our differential
equation as an integral equation with $g$ as kernel, and this is why the
Fredholm alternative works for a large class of linear differential equations.
Example: Consider the Legendre-equation operator

$$L = -\frac{d}{dx}\,(1-x^2)\,\frac{d}{dx} \qquad (9.135)$$

acting on functions $u \in L^2[-1,1]$ with boundary conditions that $u$ be finite
at the endpoints. This operator has a normalized zero mode $u_0 = 1/\sqrt{2}$, so
it cannot have an inverse. There exists, however, a modified Green function
$g(x,x')$ that satisfies

$$L g = \delta(x-x') - \frac{1}{2}. \qquad (9.136)$$

It is

$$g(x,x') = \ln 2 - \frac{1}{2} - \frac{1}{2}\ln\bigl[(1+x_>)(1-x_<)\bigr], \qquad (9.137)$$

where $x_>$ is the greater of $x$ and $x'$, and $x_<$ the lesser. We may verify that

$$\int_{-1}^{1}\!\int_{-1}^{1} |g(x,x')|^2\, dx\, dx' < \infty, \qquad (9.138)$$

so $g$ is Hilbert-Schmidt and therefore the kernel of a compact operator. The
eigenvalue problem

$$L u_n = \lambda_n u_n \qquad (9.139)$$

can be recast as the integral equation

$$\mu_n u_n = \int_{-1}^{1} g(x,x')\,u_n(x')\,dx' \qquad (9.140)$$

with $\mu_n = \lambda_n^{-1}$. The compactness of $g$ guarantees that there is a complete
set of eigenfunctions (these being the Legendre polynomials $P_n(x)$ for $n > 0$)
having eigenvalues $\mu_n = 1/n(n+1)$. The operator $g$ also has the eigenfunction
$P_0$ with eigenvalue $\mu_0 = 0$. This example provides the justification for the
claim that the "finite" boundary conditions we adopted for the Legendre
equation in chapter 8 give us a self-adjoint operator.
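The eigenvalue claim can be spot-checked numerically. The sketch below, which assumes the modified Green function $g(x,x') = \ln 2 - \tfrac12 - \tfrac12\ln[(1+x_>)(1-x_<)]$, applies $g$ to $P_0$, $P_1$ and $P_2$ by quadrature and compares with $0$, $P_1/2$ and $P_2/6$. The helper names and the sample points are illustrative choices, not part of the text.

```python
import math

C = math.log(2.0) - 0.5

def g(x, xp):
    """Modified Green function for the Legendre operator, -1 < x < 1."""
    hi, lo = max(x, xp), min(x, xp)
    return C - 0.5 * math.log((1.0 + hi) * (1.0 - lo))

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3.0

def apply_g(u, x):
    """(g u)(x) = int_{-1}^{1} g(x, x') u(x') dx', split at the corner x' = x."""
    integrand = lambda xp: g(x, xp) * u(xp)
    return simpson(integrand, -1.0, x) + simpson(integrand, x, 1.0)

P0 = lambda x: 1.0
P1 = lambda x: x
P2 = lambda x: 0.5 * (3.0 * x * x - 1.0)

for x in (-0.6, 0.3, 0.9):
    print(x, apply_g(P0, x), apply_g(P1, x) - P1(x) / 2.0, apply_g(P2, x) - P2(x) / 6.0)
```

All three printed differences should be zero to within the quadrature error, consistent with $\mu_0 = 0$, $\mu_1 = 1/2$ and $\mu_2 = 1/6$.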

Note that K(x,y) does not have to be bounded for K to be Hilbert- 
Schmidt. 

Example: The kernel

$$K(x,y) = \frac{1}{(x-y)^\alpha}, \quad |x|, |y| < 1 \qquad (9.141)$$

is Hilbert-Schmidt provided $\alpha < \tfrac{1}{2}$.
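The restriction on the exponent can be read off from the definition of a Hilbert-Schmidt kernel. As a short worked step (not spelled out in the text), changing variables to $s = x - y$ in the double integral of $|K|^2$ over the square $|x|, |y| < 1$ gives

```latex
\int_{-1}^{1}\!\int_{-1}^{1} \frac{dx\,dy}{|x-y|^{2\alpha}}
  = \int_{-2}^{2} (2-|s|)\,|s|^{-2\alpha}\,ds
  = 2\int_{0}^{2} (2-s)\,s^{-2\alpha}\,ds
  = \frac{2^{\,3-2\alpha}}{(1-2\alpha)(2-2\alpha)} .
```

The last equality holds only when $2\alpha < 1$; for larger exponents the integrand is not integrable at $s = 0$ and the kernel fails to be Hilbert-Schmidt.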
Example: The kernel

$$K(x,y) = \frac{1}{2m}\, e^{-m|x-y|}, \quad x, y \in \mathbb{R} \qquad (9.142)$$

is not Hilbert-Schmidt because $|K(x-y)|$ is constant along the lines
$x - y = \text{constant}$, which lie parallel to the diagonal. $K$ has a continuous
spectrum consisting of all positive real numbers less than $1/m^2$. It cannot
be compact, therefore, but it is bounded with $\|K\| = 1/m^2$. The integral
equation (9.22) contains this kernel, and the Fredholm alternative does not
apply to it.

9.7.2 Closed Operators 

One motivation for our including a brief account of functional analysis is 
that an attentive reader will have realized that some of the statements we 
have made in earlier chapters appear to be inconsistent. We have asserted in 
chapter 2 that no significance can be attached to the value of an $L^2$ function
at any particular point; only integrated averages matter. In later chapters,
though, we have happily imposed boundary conditions that require these
very functions to take specified values at the endpoints of our interval. In this
section we will resolve this paradox. The apparent contradiction is intimately
connected with our imposing boundary conditions only on derivatives of lower
order than that of the differential equation, but understanding why this
is so requires some function-analytic language.

Differential operators $L$ are never continuous; we cannot deduce from
$u_n \to u$ that $Lu_n \to Lu$. Differential operators can be closed, however. A
closed operator is one for which whenever a sequence $u_n$ converges to a limit
$u$ and at the same time the image sequence $Lu_n$ also converges to a limit $f$,
then $u$ is in the domain of $L$ and $Lu = f$. The name is not meant to imply
that the domain of definition is closed, but indicates instead that the graph
of $L$ (this being the set $\{u, Lu\}$ considered as a subset of $L^2[a,b] \times L^2[a,b]$)
contains its limit points and so is a closed set.
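The failure of continuity is easy to see in a concrete case. The sketch below uses the illustrative sequence $u_n(x) = \sin(n\pi x)/n$ on $[0,1]$ (an assumption chosen for this example, not taken from the text): its $L^2$ norm tends to zero, yet the norm of its derivative stays fixed at $\pi/\sqrt{2}$, so $u_n \to 0$ does not imply $Lu_n \to 0$ for $L = d/dx$.

```python
import math

def l2_norm(func, n=20000):
    """L^2[0, 1] norm by the composite midpoint rule."""
    h = 1.0 / n
    return math.sqrt(sum(func((j + 0.5) * h) ** 2 for j in range(n)) * h)

def u(n):
    """u_n(x) = sin(n pi x) / n."""
    return lambda x: math.sin(n * math.pi * x) / n

def du(n):
    """Derivative of u_n: pi cos(n pi x)."""
    return lambda x: math.pi * math.cos(n * math.pi * x)

for n in (1, 10, 100):
    print(n, l2_norm(u(n)), l2_norm(du(n)))
```

The first norm falls off as $1/(n\sqrt{2})$ while the second does not change with $n$.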

Any self-adjoint operator is automatically closed. To see why this is so,
recall that in defining the adjoint of an operator $A$, we say that $y$ is in the
domain of $A^\dagger$ if there is a $z$ such that $\langle y, Ax\rangle = \langle z, x\rangle$ for all $x$ in the domain
of $A$. We then set $A^\dagger y = z$. Now suppose that $y_n \to y$ and $A^\dagger y_n = z_n \to z$.
The Cauchy-Schwarz-Bunyakovsky inequality shows that the inner product
is a continuous function of its arguments. Consequently, if $x$ is in the domain
of $A$, we can take the limit of $\langle y_n, Ax\rangle = \langle A^\dagger y_n, x\rangle = \langle z_n, x\rangle$ to deduce that
$\langle y, Ax\rangle = \langle z, x\rangle$. But this means that $y$ is in the domain of $A^\dagger$, and $z = A^\dagger y$.
The adjoint of any operator is therefore a closed operator. A self-adjoint
operator, being its own adjoint, is therefore necessarily closed.






A deep result states that a closed operator defined on a closed domain is
bounded. Since differential operators are always unbounded, the domain of a
closed differential operator can never be a closed set.

An operator may not be closed but may be closable, in that we can
make it closed by including additional functions in its domain. The essential
requirement for closability is that we never have two sequences $u_n$ and $v_n$
which converge to the same limit, $w$, while $Lu_n$ and $Lv_n$ both converge, but
to different limits. Closability is equivalent to requiring that if $u_n \to 0$ and
$Lu_n$ converges, then $Lu_n$ converges to zero.

Example: Let $L = d/dx$. Suppose that $u_n \to 0$ and $Lu_n \to f$. If $\varphi$ is a
smooth $L^2$ function that vanishes at 0, 1, then

$$\int_0^1 \varphi f\,dx = \lim_{n\to\infty}\int_0^1 \varphi\,\frac{du_n}{dx}\,dx = -\lim_{n\to\infty}\int_0^1 \varphi'\,u_n\,dx = 0. \qquad (9.143)$$

Here we have used the continuity of the inner product to justify the
interchange of the order of limit and integral. By the same arguments we used when
dealing with the calculus of variations, we deduce that $f = 0$. Thus $d/dx$ is
closable.

If an operator is closable, we may as well add the extra functions to its
domain and make it closed. Let us consider what closure means for the
operator

$$L = \frac{d}{dx}, \qquad \mathcal{D}(L) = \{y \in C^1[0,1] : y'(0) = 0\}. \qquad (9.144)$$

Here, in fixing the derivative at the endpoint, we are imposing a boundary
condition of higher order than we ought.

Consider the sequence of differentiable functions $y_a$ shown in figure 9.10.
These functions have vanishing derivative at $x = 0$, but tend in $L^2$ to a
function $y$ whose derivative is non-zero at $x = 0$.

Figure 9.10: $\lim_{a\to 0} y_a = y$ in $L^2[0,1]$.

Figure 9.11 shows that the derivative of these functions also converges in $L^2$.

Figure 9.11: $y'_a \to y'$ in $L^2[0,1]$.

If we want $L$ to be closed, we should therefore extend the domain of definition
of $L$ to include functions with non-vanishing endpoint derivative. We can also
use this method to add to the domain of $L$ functions that are only piecewise
differentiable, i.e. functions with a discontinuous derivative.

Now consider what happens if we try to extend the domain of

$$L = \frac{d}{dx}, \qquad \mathcal{D}(L) = \{y, y' \in L^2 : y(0) = 0\}, \qquad (9.145)$$

to include functions that do not vanish at the endpoint. Take the
sequence of functions $y_a$ shown in figure 9.12. These functions vanish at the
origin, and converge in $L^2$ to a function that does not vanish at the origin.

Figure 9.12: $\lim_{a\to 0} y_a = y$ in $L^2[0,1]$.

Now, as figure 9.13 shows, the derivatives converge towards the derivative
of the limit function, together with a delta function near the origin. The
area under the functions $|y'_a(x)|^2$ grows without bound and the sequence $Ly_a$
becomes infinitely far from the derivative of the limit function when distance
is measured in the $L^2$ norm.

Figure 9.13: $y'_a \to \delta(x)$, but the delta function is not an element of $L^2[0,1]$.

We therefore cannot use closure to extend the domain to include these
functions. Another way of saying this is that, in order for the weak derivative of
$y$ to be in $L^2$, and therefore for $y$ to be in the domain of $d/dx$, the function $y$
need not be classically differentiable, but its $L^2$ equivalence class must
contain a continuous function, and continuous functions do have well-defined
values. It is the values of this continuous representative that are constrained
by the boundary conditions.
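A concrete sequence makes the obstruction quantitative. The functions in the figures are not specified in the text, so the sketch below assumes a hypothetical ramp, $y_a(x) = x/a$ for $x < a$ and $y_a(x) = 1$ otherwise. Each ramp vanishes at the origin and tends in $L^2[0,1]$ to the constant function 1, since $\|y_a - 1\|^2 = a/3$, but $\|y'_a\|^2 = 1/a$ grows without bound.

```python
def ramp(a):
    """Hypothetical y_a: rises linearly from 0 to 1 on [0, a], then stays at 1."""
    return lambda x: x / a if x < a else 1.0

def ramp_prime(a):
    """Derivative of the ramp (defined away from the corner x = a)."""
    return lambda x: 1.0 / a if x < a else 0.0

def norm_sq(func, n=100000):
    """Squared L^2[0, 1] norm by the composite midpoint rule."""
    h = 1.0 / n
    return sum(func((j + 0.5) * h) ** 2 for j in range(n)) * h

for a in (0.1, 0.01, 0.001):
    y = ramp(a)
    distance_sq = norm_sq(lambda x: y(x) - 1.0)   # tends to 0 like a/3
    derivative_sq = norm_sq(ramp_prime(a))        # grows like 1/a
    print(a, distance_sq, derivative_sq)
```

The first column of norms shrinks while the second diverges, so the image sequence $Ly_a$ has no $L^2$ limit and closure cannot add the limit function to the domain.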

This story repeats for differential operators of any order: If we try to
impose boundary conditions of too high an order, they are washed out in the
process of closing the operator. Boundary conditions of lower order cannot
be eliminated, however, and so make sense as statements involving functions
in $L^2$.

9.8 Series Solutions 

One of the advantages of recasting a problem as an integral equation is that
the equation often suggests a systematic approximation scheme. Usually we
start from the solution of an exactly solvable problem and expand the desired
solution about it as an infinite series in some small parameter. The terms in
such a perturbation series may become progressively harder to evaluate, but,
if we are lucky, the sum of the first few will prove adequate for our purposes.

9.8.1 Liouville-Neumann-Born Series 

The geometric series

$$S = 1 - x + x^2 - x^3 + \cdots \qquad (9.146)$$

converges to $1/(1+x)$ provided $|x| < 1$. Suppose we wish to solve

$$(I + \lambda K)\varphi = f \qquad (9.147)$$

where $K$ is an integral operator. It is then natural to write

$$\varphi = (I + \lambda K)^{-1} f = (1 - \lambda K + \lambda^2 K^2 - \lambda^3 K^3 + \cdots) f \qquad (9.148)$$

where

$$K^2(x,y) = \int K(x,z)K(z,y)\,dz, \qquad K^3(x,y) = \int K(x,z_1)K(z_1,z_2)K(z_2,y)\,dz_1\,dz_2, \qquad (9.149)$$

and so on. This Liouville-Neumann series will converge, and yield a solution
to the problem, provided that $|\lambda|\,\|K\| < 1$. In quantum mechanics this series
is known as the Born series.
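The series is straightforward to sum numerically once the integral operator is replaced by a matrix. The following is a minimal sketch, assuming a midpoint-rule discretization on $[0,1]$ and the illustrative choices $K(x,y) = xy$, $f(x) = x$, $\lambda = 0.5$ (none of which are prescribed by the text). For this separable kernel the exact solution of $(I + \lambda K)\varphi = f$ is $\varphi(x) = 3x/(3+\lambda)$, which serves as a check.

```python
def neumann_solve(kernel, f, lam, n_grid=200, n_terms=40):
    """Sum phi = sum_n (-lam K)^n f on a midpoint grid over [0, 1]."""
    h = 1.0 / n_grid
    xs = [(i + 0.5) * h for i in range(n_grid)]
    # Matrix of the integral operator: K[i][j] ~ K(x_i, y_j) * dy
    K = [[kernel(x, y) * h for y in xs] for x in xs]
    term = [f(x) for x in xs]          # zeroth-order term
    phi = term[:]
    for _ in range(n_terms):
        # next term of the series: multiply the previous one by (-lam K)
        term = [-lam * sum(k * t for k, t in zip(row, term)) for row in K]
        phi = [p + t for p, t in zip(phi, term)]
    return xs, phi

lam = 0.5
xs, phi = neumann_solve(lambda x, y: x * y, lambda x: x, lam)
exact = [3.0 * x / (3.0 + lam) for x in xs]
print(max(abs(p - e) for p, e in zip(phi, exact)))
```

Here $|\lambda|\,\|K\| = \lambda/3 < 1$, so the partial sums converge geometrically; the remaining discrepancy is the discretization error of the midpoint rule.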

9.8.2 Fredholm Series 

A familiar result from high-school algebra is Cramer's rule, which gives the
solution of a set of linear equations in terms of ratios of determinants. For
example, the system of equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2,\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3,
\end{aligned} \qquad (9.150)$$

has solution

$$x_1 = \frac{1}{D}\begin{vmatrix} b_1 & a_{12} & a_{13}\\ b_2 & a_{22} & a_{23}\\ b_3 & a_{32} & a_{33}\end{vmatrix},\quad
x_2 = \frac{1}{D}\begin{vmatrix} a_{11} & b_1 & a_{13}\\ a_{21} & b_2 & a_{23}\\ a_{31} & b_3 & a_{33}\end{vmatrix},\quad
x_3 = \frac{1}{D}\begin{vmatrix} a_{11} & a_{12} & b_1\\ a_{21} & a_{22} & b_2\\ a_{31} & a_{32} & b_3\end{vmatrix}, \qquad (9.151)$$

where

$$D = \begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\end{vmatrix}. \qquad (9.152)$$

Although not as computationally efficient as standard Gaussian elimination,
Cramer's rule is useful in that it is a closed-form solution. It is equivalent to
the statement that the inverse of a matrix is given by the transposed matrix
of the co-factors, divided by the determinant.
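Cramer's rule for a 3-by-3 system can be written out in a few lines. The sketch below uses exact rational arithmetic so that the result can be checked without rounding; the particular matrix and right-hand side are illustrative, not taken from the text.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, b):
    """Solve A x = b by Cramer's rule: x_k = det(A with column k replaced by b) / det A."""
    D = det3(A)
    if D == 0:
        raise ValueError("singular system: Cramer's rule does not apply")
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = b[r]
        solution.append(det3(M) / D)
    return solution

A = [[Fraction(v) for v in row] for row in [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]]
b = [Fraction(v) for v in [8, -11, -3]]
x = cramer3(A, b)
print(x)
```

Substituting the result back into the three equations reproduces the right-hand side exactly.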

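A similar formula for integral equations was given by Fredholm. The
equations he considered were, in operator form,

$$(I + \lambda K)\varphi = f, \qquad (9.153)$$

where $I$ is the identity operator, $K$ is an integral operator with kernel
$K(x,y)$, and $\lambda$ a parameter. We motivate Fredholm's formula by giving an
expansion for the determinant of a finite matrix. Let $K$ be an $n$-by-$n$ matrix
and

$$D(\lambda) = \det(I + \lambda K) = \begin{vmatrix}
1 + \lambda K_{11} & \lambda K_{12} & \cdots & \lambda K_{1n}\\
\lambda K_{21} & 1 + \lambda K_{22} & \cdots & \lambda K_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
\lambda K_{n1} & \lambda K_{n2} & \cdots & 1 + \lambda K_{nn}
\end{vmatrix}. \qquad (9.154)$$

Then

$$D(\lambda) = \sum_{m=0}^{n} \frac{\lambda^m}{m!} A_m, \qquad (9.155)$$

where $A_0 = 1$, $A_1 = \operatorname{tr} K \equiv \sum_i K_{ii}$,

$$A_2 = \sum_{i_1,i_2=1}^{n} \begin{vmatrix} K_{i_1 i_1} & K_{i_1 i_2}\\ K_{i_2 i_1} & K_{i_2 i_2}\end{vmatrix}, \qquad
A_3 = \sum_{i_1,i_2,i_3=1}^{n} \begin{vmatrix} K_{i_1 i_1} & K_{i_1 i_2} & K_{i_1 i_3}\\ K_{i_2 i_1} & K_{i_2 i_2} & K_{i_2 i_3}\\ K_{i_3 i_1} & K_{i_3 i_2} & K_{i_3 i_3}\end{vmatrix}. \qquad (9.156)$$

The pattern for the rest of the terms should be obvious, as should the proof.

As observed above, the inverse of a matrix is the reciprocal of the
determinant of the matrix multiplied by the transposed matrix of the co-factors.
So, if $D_{\mu\nu}$ is the co-factor of the term in $D(\lambda)$ associated with $K_{\nu\mu}$, then the
solution of the matrix equation

$$(I + \lambda K)x = b \qquad (9.157)$$

is

$$x_\mu = \frac{D_{\mu 1}b_1 + D_{\mu 2}b_2 + \cdots + D_{\mu n}b_n}{D(\lambda)}. \qquad (9.158)$$

If $\mu \ne \nu$ we have

$$D_{\mu\nu} = -\lambda K_{\mu\nu} - \lambda^2 \sum_i \begin{vmatrix} K_{\mu\nu} & K_{\mu i}\\ K_{i\nu} & K_{ii}\end{vmatrix}
- \lambda^3 \frac{1}{2!}\sum_{i_1,i_2} \begin{vmatrix} K_{\mu\nu} & K_{\mu i_1} & K_{\mu i_2}\\ K_{i_1\nu} & K_{i_1 i_1} & K_{i_1 i_2}\\ K_{i_2\nu} & K_{i_2 i_1} & K_{i_2 i_2}\end{vmatrix} - \cdots. \qquad (9.159)$$

When $\mu = \nu$ we have

$$D_{\mu\mu} = \tilde D(\lambda), \qquad (9.160)$$

where $\tilde D(\lambda)$ is the expression analogous to $D(\lambda)$, but with the $\mu$'th row and
column deleted.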
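The expansion of $\det(I + \lambda K)$ in powers of $\lambda$, with coefficients $A_m/m!$ built from sums of $m$-by-$m$ determinants of entries of $K$, can be confirmed on a small example. The sketch below uses exact fractions and a brute-force sum over all index tuples; the 3-by-3 matrix and the value of $\lambda$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 0:
        return Fraction(1)
    total = Fraction(0)
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def fredholm_expansion(K, lam):
    """sum_m lam^m / m! * A_m, with A_m the sum over i_1..i_m of det[K_{i_a i_b}]."""
    n = len(K)
    total = Fraction(0)
    for m in range(n + 1):
        A_m = sum(det([[K[i][j] for j in idx] for i in idx])
                  for idx in product(range(n), repeat=m))
        total += lam ** m * A_m / factorial(m)
    return total

K = [[Fraction(v) for v in row] for row in [[1, 2, 0], [3, -1, 4], [0, 5, 2]]]
lam = Fraction(1, 3)
direct = det([[(1 if i == j else 0) + lam * K[i][j] for j in range(3)] for i in range(3)])
print(fredholm_expansion(K, lam), direct)
```

Tuples with a repeated index contribute a vanishing determinant, which is why the unrestricted sums divided by $m!$ reproduce the sums of principal minors.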

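These elementary results suggest the definition of the Fredholm
determinant of the integral kernel $K(x,y)$, $a < x, y < b$, as

$$D(\lambda) = \operatorname{Det}|I + \lambda K| = \sum_{m=0}^{\infty} \frac{\lambda^m}{m!} A_m, \qquad (9.161)$$

where $A_0 = 1$, $A_1 = \operatorname{Tr} K \equiv \int_a^b K(x,x)\,dx$,

$$A_2 = \int_a^b\!\!\int_a^b \begin{vmatrix} K(x_1,x_1) & K(x_1,x_2)\\ K(x_2,x_1) & K(x_2,x_2)\end{vmatrix} dx_1\,dx_2,$$

$$A_3 = \int_a^b\!\!\int_a^b\!\!\int_a^b \begin{vmatrix} K(x_1,x_1) & K(x_1,x_2) & K(x_1,x_3)\\ K(x_2,x_1) & K(x_2,x_2) & K(x_2,x_3)\\ K(x_3,x_1) & K(x_3,x_2) & K(x_3,x_3)\end{vmatrix} dx_1\,dx_2\,dx_3, \qquad (9.162)$$

etc. We also define

$$D(x,y,\lambda) = -\lambda K(x,y) - \lambda^2 \int_a^b \begin{vmatrix} K(x,y) & K(x,\xi)\\ K(\xi,y) & K(\xi,\xi)\end{vmatrix} d\xi
- \lambda^3 \frac{1}{2!}\int_a^b\!\!\int_a^b \begin{vmatrix} K(x,y) & K(x,\xi_1) & K(x,\xi_2)\\ K(\xi_1,y) & K(\xi_1,\xi_1) & K(\xi_1,\xi_2)\\ K(\xi_2,y) & K(\xi_2,\xi_1) & K(\xi_2,\xi_2)\end{vmatrix} d\xi_1\,d\xi_2 - \cdots, \qquad (9.163)$$

and then

$$\varphi(x) = f(x) + \frac{1}{D(\lambda)} \int_a^b D(x,y,\lambda)\, f(y)\,dy \qquad (9.164)$$

is the solution of the equation

$$\varphi(x) + \lambda \int_a^b K(x,y)\,\varphi(y)\,dy = f(x). \qquad (9.165)$$

If $|K(x,y)| < M$ in $[a,b] \times [a,b]$, the Fredholm series for $D(\lambda)$ and $D(x,y,\lambda)$
converge for all $\lambda$, and define entire functions. In this feature it is unlike the
Neumann series, which has a finite radius of convergence.

The proof of these claims follows from the identity

$$D(x,y,\lambda) + \lambda D(\lambda)K(x,y) + \lambda \int_a^b D(x,\xi,\lambda)K(\xi,y)\,d\xi = 0, \qquad (9.166)$$

or, more compactly with $G(x,y) = D(x,y,\lambda)/D(\lambda)$,

$$(I + G)(I + \lambda K) = I. \qquad (9.167)$$

For details see Whittaker and Watson §11.2.

Example: The equation

$$\varphi(x) = x + \lambda \int_0^1 xy\,\varphi(y)\,dy \qquad (9.168)$$

gives us

$$D(\lambda) = 1 - \tfrac{1}{3}\lambda, \qquad D(x,y,\lambda) = \lambda xy \qquad (9.169)$$

and so

$$\varphi(x) = \frac{3x}{3-\lambda}. \qquad (9.170)$$

(We have considered this equation and solution before, in section 9.4.)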
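The contrast between the two series shows up already in this example, $\varphi(x) = x + \lambda\int_0^1 xy\,\varphi(y)\,dy$. Writing $\varphi(x) = c\,x$, iterating the equation gives the Neumann partial sums $c_N = \sum_{n=0}^{N} (\lambda/3)^n$, which converge only for $|\lambda| < 3$, whereas the Fredholm formula gives $c = 1 + (\lambda/3)/D(\lambda) = 3/(3-\lambda)$ for every $\lambda \ne 3$. The sketch below compares the two at $\lambda = 1$ and $\lambda = 5$; the function names and the midpoint quadrature are illustrative choices.

```python
def neumann_coefficient(lam, n_terms):
    """Partial sum of sum_n (lam/3)^n, the Neumann series for phi(x) = c x."""
    return sum((lam / 3.0) ** n for n in range(n_terms + 1))

def fredholm_coefficient(lam):
    """c = 1 + (lam/3) / D(lam) with D(lam) = 1 - lam/3."""
    return 1.0 + (lam / 3.0) / (1.0 - lam / 3.0)

def residual(c, lam, x, n=1000):
    """phi(x) - x - lam * int_0^1 x y phi(y) dy for phi(y) = c y (midpoint rule)."""
    h = 1.0 / n
    integral = sum(x * y * c * y for y in ((j + 0.5) * h for j in range(n))) * h
    return c * x - x - lam * integral

for lam in (1.0, 5.0):
    print(lam, neumann_coefficient(lam, 60), fredholm_coefficient(lam),
          residual(fredholm_coefficient(lam), lam, 0.7))
```

At $\lambda = 5$ the Neumann partial sums blow up, while the Fredholm result $c = -3/2$ still satisfies the equation.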
9.9 Further Exercises and Problems

Exercise 9.1: The following problems should be relatively easy.

a) Solve the inhomogeneous type II Fredholm integral equation 

u(x) = e x + A / xyu(y)dy. 
Jo 

b) Solve the homogeneous type II Fredholm integral equation 



u(x) = A / sin(x — y) u(y) dy . 
Jo 

c) Solve the integral equation 

u(x) = x + A / (yx + y 2 ) u{y) dy 
Jo 

to second order in A using 

(i) the Neumann series; and 

(ii) the Fredholm series. 

d) By differentiating, solve the integral equation: u(x) = x + u(y) dy. 

e) Solve the integral equation: u(x) = x 2 + Jq xyu{y) dy. 

f) Find the eigenfunction(s) and eigenvalue(s) of the integral equation 



u(x) = A / e x y u(y) dy . 
Jo 



g) Solve the integral equation: u(x) = e x + A Jq e x v u(y) dy. 

h) Solve the integral equation 

u(x) = x + / dy (1 + xy) u(y) 
Jo 

for the unknown function u(x). 
Exercise 9.2: Solve the integral equation

$$u(x) = f(x) + \lambda \int_0^1 x^3 y^3\, u(y)\,dy, \quad 0 < x < 1$$

for the unknown $u(x)$ in terms of the given function $f(x)$. For what values
of $\lambda$ does a unique solution $u(x)$ exist without restrictions on $f(x)$? For what
value $\lambda = \lambda_0$ does a solution exist only if $f(x)$ satisfies some condition? Using
the language of the Fredholm alternative, and the range and nullspace of the
relevant operators, explain what is happening when $\lambda = \lambda_0$. For the case
$\lambda = \lambda_0$ find explicitly the condition on $f(x)$ and, assuming this condition is
satisfied, write down the corresponding general solution for $u(x)$. Check that
this solution does indeed satisfy the integral equation.

Exercise 9.3: Use a Laplace transform to find the solution to the generalized
Abel equation

$$f(x) = \int_0^x (x-t)^{-\mu}\, u(t)\,dt, \quad 0 < \mu < 1,$$

where $f(x)$ is given and $f(0) = 0$. Your solution will be of the form

$$u(x) = \int_0^x K(x-t)\, f'(t)\,dt,$$

and you should give an explicit expression for the kernel $K(x-t)$.
You will find the formula

$$\int_0^\infty t^{\mu-1} e^{-pt}\,dt = p^{-\mu}\,\Gamma(\mu), \quad \mu > 0,$$

to be useful.



Exercise 9.4: Translationally invariant kernels:

a) Consider the integral equation: $u(x) = g(x) + \lambda \int_{-\infty}^{\infty} K(x,y)\,u(y)\,dy$,
with the translationally invariant kernel $K(x,y) = Q(x-y)$, in which $g$,
$\lambda$ and $Q$ are known. Show that the Fourier transforms $\hat u$, $\hat g$ and $\hat Q$ satisfy
$\hat u(q) = \hat g(q)/\{1 - \sqrt{2\pi}\,\lambda\,\hat Q(q)\}$. Expand this result to second order in $\lambda$
to recover the second-order Liouville-Neumann-Born series.

b) Use Fourier transforms to find a solution of the integral equation 

/oo 
e~\ x - y \ u(y) dy 
-oo 

that remains finite as |x| — > oo. 

c) Use Laplace transforms to find a solution of the integral equation

$$u(x) = e^{-x} + \lambda \int_0^x e^{-|x-y|}\, u(y)\,dy, \quad x > 0.$$
Exercise 9.5: The integral equation

$$\frac{1}{\pi}\int_0^\infty dy\,\frac{\phi(y)}{x+y} = f(x), \quad x > 0,$$

relates the unknown function $\phi$ to the known function $f$.

(i) Show that the changes of variables

$$x = \exp 2\xi, \quad y = \exp 2\eta, \quad \phi(\exp 2\eta)\exp\eta = \psi(\eta), \quad f(\exp 2\xi)\exp\xi = g(\xi),$$

convert the integral equation into one that can be solved by an integral
transform.

(ii) Hence, or otherwise, construct an explicit formula for $\phi(x)$ in terms of a
double integral involving $f(y)$.

You may use without proof the integral

$$\int_{-\infty}^{\infty} \frac{e^{-is\xi}}{\cosh\xi}\,d\xi = \frac{\pi}{\cosh(\pi s/2)}.$$

Exercise 9.6: Using Mellin transforms. Recall that the Mellin transform $\tilde f(s)$
of the function $f(t)$ is defined by

$$\tilde f(s) = \int_0^\infty dt\, t^{s-1} f(t).$$

a) Given two functions, $f(t)$ and $g(t)$, a Mellin convolution $f * g$ can be
defined through

$$(f*g)(t) = \int_0^\infty f(tu^{-1})\,g(u)\,\frac{du}{u}.$$

Show that the Mellin transform of the Mellin convolution $f * g$ is

$$\widetilde{f*g}(s) = \int_0^\infty t^{s-1}(f*g)(t)\,dt = \tilde f(s)\,\tilde g(s).$$

Similarly find the Mellin transform of

$$(f\#g)(t) = \int_0^\infty f(tu)\,g(u)\,du.$$

b) The unknown function $F(t)$ satisfies Fox's integral equation,

$$F(t) = G(t) + \int_0^\infty dv\, Q(tv)\,F(v),$$

in which $G$ and $Q$ are known. Solve for the Mellin transform $\tilde F$ in terms
of the Mellin transforms $\tilde G$ and $\tilde Q$.
Exercise 9.7: Some more easy problems:

a) Solve the Lalesco-Picard integral equation 



u(x) = cos [IX + 



/ dye-\ x - y \u(y) 



J —oo 



b) For $\lambda \ne 3$, solve the integral equation

$$\phi(x) = 1 + \lambda \int_0^1 dy\, xy\,\phi(y).$$

c) By taking derivatives, show that the solution of the Volterra equation 



Jo 

satisfies a first order differential equation. Hence, solve the integral equa- 
tion. 

Exercise 9.8: Principal part integrals. 
a) If w is real, show that 





(This is easier than it looks.) 
b) If y is real, but not in the interval (—1, 1), show that 



7-1 (y - x)Vl - x 2 
Now let y G (-1, 1). Show that 





(This is harder than it looks.) 



Exercise 9.9: 



Consider the integral equation 




in which only u is unknown. 
a) Write down the solution $u(x)$ to second order in the Liouville-Neumann-
Born series.

b) Suppose $g(x) = x$ and $K(x,y) = \sin 2\pi xy$. Compute $u(x)$ to second
order in the Liouville-Neumann-Born series.

Exercise 9.10: Show that the application of the Fredholm series method to
the equation




gives 



and 




Appendix A 

Linear Algebra Review 



In solving the differential equations of physics we have to work with infinite 
dimensional vector spaces. Navigating these vast regions is much easier if you 
have a sound grasp of the theory of finite dimensional spaces. Most physics 
students have studied this as undergraduates, but not always in a systematic 
way. In this appendix we gather together and review those parts of linear 
algebra that we will find useful in the main text. 



A.1 Vector Space

A.1.1 Axioms

A vector space $V$ over a field $\mathbb{F}$ is a set equipped with two operations: a
binary operation called vector addition which assigns to each pair of elements
$\mathbf{x}, \mathbf{y} \in V$ a third element denoted by $\mathbf{x} + \mathbf{y}$, and scalar multiplication which
assigns to an element $\mathbf{x} \in V$ and $\lambda \in \mathbb{F}$ a new element $\lambda\mathbf{x} \in V$. There is also
a distinguished element $\mathbf{0} \in V$ such that the following axioms are obeyed$^1$:

1) Vector addition is commutative: $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$.

2) Vector addition is associative: $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$.

3) Additive identity: $\mathbf{0} + \mathbf{x} = \mathbf{x}$.

4) Existence of additive inverse: for any $\mathbf{x} \in V$, there is an element $(-\mathbf{x}) \in V$, such that $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$.

5) Scalar distributive law i) $\lambda(\mathbf{x} + \mathbf{y}) = \lambda\mathbf{x} + \lambda\mathbf{y}$.

6) Scalar distributive law ii) $(\lambda + \mu)\mathbf{x} = \lambda\mathbf{x} + \mu\mathbf{x}$.

7) Scalar multiplication is associative: $(\lambda\mu)\mathbf{x} = \lambda(\mu\mathbf{x})$.

8) Multiplicative identity: $1\mathbf{x} = \mathbf{x}$.

$^1$ In this list $1, \lambda, \mu \in \mathbb{F}$ and $\mathbf{x}, \mathbf{y}, \mathbf{0} \in V$.

The elements of $V$ are called vectors. We will only consider vector spaces
over the field of the real numbers, $\mathbb{F} = \mathbb{R}$, or the complex numbers, $\mathbb{F} = \mathbb{C}$.

You have no doubt been working with vectors for years, and are saying to 
yourself "I know this stuff" . Perhaps so, but to see if you really understand 
these axioms try the following exercise. Its value lies not so much in the 
solution of its parts, which are easy, as in appreciating that these commonly 
used properties both can and need to be proved from the axioms. (Hint: 
work the problems in the order given; the later parts depend on the earlier.) 



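Exercise A.1: Use the axioms to show that:
i) If $\mathbf{x} + \tilde{\mathbf{0}} = \mathbf{x}$, then $\tilde{\mathbf{0}} = \mathbf{0}$.
ii) We have $0\mathbf{x} = \mathbf{0}$ for any $\mathbf{x} \in V$. Here $0$ is the additive identity in $\mathbb{F}$.
iii) If $\mathbf{x} + \mathbf{y} = \mathbf{0}$, then $\mathbf{y} = -\mathbf{x}$. Thus the additive inverse is unique.
iv) Given $\mathbf{x}, \mathbf{y}$ in $V$, there is a unique $\mathbf{z}$ such that $\mathbf{x} + \mathbf{z} = \mathbf{y}$, to wit $\mathbf{z} = \mathbf{y} - \mathbf{x}$.
v) $\lambda\mathbf{0} = \mathbf{0}$ for any $\lambda \in \mathbb{F}$.
vi) If $\lambda\mathbf{x} = \mathbf{0}$, then either $\mathbf{x} = \mathbf{0}$ or $\lambda = 0$.
vii) $(-1)\mathbf{x} = -\mathbf{x}$.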



A.1.2 Bases and components

Let $V$ be a vector space over $\mathbb{F}$. For the moment, this space has no additional
structure beyond that of the previous section: no inner product and so no
notion of what it means for two vectors to be orthogonal. There is still much
that can be done, though. Here are the most basic concepts and properties
that you should understand:

i) A set of vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is linearly dependent if there exist $\lambda^\mu \in \mathbb{F}$, not all zero, such that

$$\lambda^1\mathbf{e}_1 + \lambda^2\mathbf{e}_2 + \cdots + \lambda^n\mathbf{e}_n = \mathbf{0}. \qquad (A.1)$$

ii) If it is not linearly dependent, a set of vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is linearly
independent. For a linearly independent set, a relation

$$\lambda^1\mathbf{e}_1 + \lambda^2\mathbf{e}_2 + \cdots + \lambda^n\mathbf{e}_n = \mathbf{0} \qquad (A.2)$$

can hold only if all the $\lambda^\mu$ are zero.

iii) A set of vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is said to span $V$ if for any $\mathbf{x} \in V$ there
are numbers $x^\mu$ such that $\mathbf{x}$ can be written (not necessarily uniquely)
as

$$\mathbf{x} = x^1\mathbf{e}_1 + x^2\mathbf{e}_2 + \cdots + x^n\mathbf{e}_n. \qquad (A.3)$$

A vector space is finite dimensional if a finite spanning set exists.

iv) A set of vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a basis if it is a maximal linearly
independent set (i.e. introducing any additional vector makes the set
linearly dependent). An alternative definition declares a basis to be a
minimal spanning set (i.e. deleting any of the $\mathbf{e}_\mu$ destroys the spanning
property). Exercise: Show that these two definitions are equivalent.

v) If $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a basis then any $\mathbf{x} \in V$ can be written

$$\mathbf{x} = x^1\mathbf{e}_1 + x^2\mathbf{e}_2 + \cdots + x^n\mathbf{e}_n, \qquad (A.4)$$

where the $x^\mu$, the components of the vector with respect to this basis,
are unique in that two vectors coincide if and only if they have the
same components.

vi) Fundamental Theorem: If the sets $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ and $\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_m\}$
are both bases for the space $V$ then $m = n$. This invariant integer is
the dimension, $\dim(V)$, of the space. For a proof (not difficult) see
a mathematics text such as Birkhoff and Mac Lane's Survey of Modern
Algebra, or Halmos' Finite Dimensional Vector Spaces.

Suppose that $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ and $\{\mathbf{e}'_1, \mathbf{e}'_2, \ldots, \mathbf{e}'_n\}$ are both bases, and that

$$\mathbf{e}_\nu = a^\mu{}_\nu\, \mathbf{e}'_\mu. \qquad (A.5)$$

Since $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a basis, the $\mathbf{e}'_\nu$ can also be uniquely expressed in terms
of the $\mathbf{e}_\mu$, and so the numbers $a^\mu{}_\nu$ constitute an invertible matrix. (Note that
we are, as usual, using the Einstein summation convention that repeated
indices are to be summed over.) The components $x'^\mu$ of $\mathbf{x}$ in the new basis
are then found by comparing the coefficients of $\mathbf{e}'_\mu$ in

$$x'^\mu \mathbf{e}'_\mu = \mathbf{x} = x^\nu \mathbf{e}_\nu = x^\nu\bigl(a^\mu{}_\nu\, \mathbf{e}'_\mu\bigr) = \bigl(x^\nu a^\mu{}_\nu\bigr)\mathbf{e}'_\mu \qquad (A.6)$$

to be $x'^\mu = a^\mu{}_\nu x^\nu$, or equivalently, $x^\nu = (a^{-1})^\nu{}_\mu\, x'^\mu$. Note how the $\mathbf{e}_\mu$ and the
$x^\mu$ transform in "opposite" directions. The components $x^\mu$ are therefore said
to transform contravariantly.
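A two-dimensional example makes the "opposite" transformation of basis vectors and components concrete. In the sketch below the primed basis is taken to be the standard one and the matrix $a^\mu{}_\nu$ is an arbitrary invertible choice (both are illustrative assumptions); the code checks that $x^\nu \mathbf{e}_\nu$ and $x'^\mu \mathbf{e}'_\mu$ describe the same vector, and that the inverse matrix recovers the original components.

```python
from fractions import Fraction as Fr

# a[mu][nu] = a^mu_nu, defined by e_nu = a^mu_nu e'_mu
a = [[Fr(2), Fr(1)],
     [Fr(1), Fr(1)]]

def to_primed(matrix, x):
    """x'^mu = a^mu_nu x^nu (components transform with the matrix itself)."""
    return [sum(matrix[mu][nu] * x[nu] for nu in range(2)) for mu in range(2)]

def inverse2(m):
    """Inverse of a 2-by-2 matrix."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def assemble(components, basis):
    """Sum of components[mu] * basis[mu], with basis vectors given as coordinate lists."""
    return [sum(c * e[i] for c, e in zip(components, basis)) for i in range(2)]

e_primed = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]                          # e'_1, e'_2
e = [assemble([a[0][nu], a[1][nu]], e_primed) for nu in range(2)]    # e_1, e_2

x = [Fr(3), Fr(-1)]            # components in the unprimed basis
x_primed = to_primed(a, x)     # components in the primed basis

print(assemble(x, e), assemble(x_primed, e_primed))
print(to_primed(inverse2(a), x_primed))
```

Both assembled vectors coincide, even though the component lists differ.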
A.2 Linear Maps

Let $V$ and $W$ be vector spaces having dimensions $n$ and $m$ respectively. A
linear map, or linear operator, $A$ is a function $A : V \to W$ with the property
that

$$A(\lambda\mathbf{x} + \mu\mathbf{y}) = \lambda A(\mathbf{x}) + \mu A(\mathbf{y}). \qquad (A.7)$$



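A.2.1 Matrices

The linear map $A$ is an object that exists independently of any basis. Given
bases $\{\mathbf{e}_\mu\}$ for $V$ and $\{\mathbf{f}_\nu\}$ for $W$, however, the map may be represented by
an $m$-by-$n$ matrix. We obtain this matrix

$$\mathbf{A} = \begin{pmatrix} a^1{}_1 & a^1{}_2 & \cdots & a^1{}_n\\ a^2{}_1 & a^2{}_2 & \cdots & a^2{}_n\\ \vdots & \vdots & \ddots & \vdots\\ a^m{}_1 & a^m{}_2 & \cdots & a^m{}_n \end{pmatrix}, \qquad (A.8)$$

having entries $a^\nu{}_\mu$, by looking at the action of $A$ on the basis elements:

$$A(\mathbf{e}_\mu) = \mathbf{f}_\nu\, a^\nu{}_\mu. \qquad (A.9)$$

To make the right-hand side of (A.9) look like a matrix product, where we
sum over adjacent indices, the array $a^\nu{}_\mu$ has been written to the right of the
basis vector$^2$. The map $\mathbf{y} = A(\mathbf{x})$ is therefore

$$\mathbf{y} \equiv y^\nu \mathbf{f}_\nu = A(\mathbf{x}) = A(x^\mu\mathbf{e}_\mu) = x^\mu A(\mathbf{e}_\mu) = x^\mu(\mathbf{f}_\nu\, a^\nu{}_\mu) = (a^\nu{}_\mu x^\mu)\,\mathbf{f}_\nu, \qquad (A.10)$$

whence, comparing coefficients of $\mathbf{f}_\nu$, we have

$$y^\nu = a^\nu{}_\mu x^\mu. \qquad (A.11)$$

The action of the linear map on components is therefore given by the usual
matrix multiplication from the left: $\mathbf{y} = \mathbf{A}\mathbf{x}$, or more explicitly

$$\begin{pmatrix} y^1\\ y^2\\ \vdots\\ y^m\end{pmatrix} = \begin{pmatrix} a^1{}_1 & a^1{}_2 & \cdots & a^1{}_n\\ a^2{}_1 & a^2{}_2 & \cdots & a^2{}_n\\ \vdots & \vdots & \ddots & \vdots\\ a^m{}_1 & a^m{}_2 & \cdots & a^m{}_n \end{pmatrix}\begin{pmatrix} x^1\\ x^2\\ \vdots\\ x^n\end{pmatrix}. \qquad (A.12)$$

$^2$ You have probably seen this "backward" action before in quantum mechanics. If we
use Dirac notation $|n\rangle$ for an orthonormal basis, and insert a complete set of states, $|m\rangle\langle m|$,
then $A|n\rangle = |m\rangle\langle m|A|n\rangle$. The matrix $\langle m|A|n\rangle$ representing the operator $A$ operating on
a vector from the left thus automatically appears to the right of the basis vectors used to
expand the result.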
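The recipe of reading off the matrix from the images of the basis vectors can be tried on a familiar linear map. As an illustrative assumption, take $A = d/dx$ acting from polynomials of degree at most 3 (basis $\mathbf{e}_\mu = x^\mu$, $\mu = 0,\ldots,3$) to polynomials of degree at most 2 (basis $\mathbf{f}_\nu = x^\nu$). The sketch builds the 3-by-4 matrix column by column and checks that matrix multiplication on components reproduces differentiation.

```python
def derivative(coeffs):
    """Differentiate a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def matrix_of(linear_map, dim_in, dim_out):
    """Column mu holds the components of A(e_mu) in the output basis."""
    columns = []
    for mu in range(dim_in):
        e_mu = [1 if k == mu else 0 for k in range(dim_in)]
        image = linear_map(e_mu)
        columns.append(image + [0] * (dim_out - len(image)))
    # transpose so that rows are labelled by nu and columns by mu
    return [[columns[mu][nu] for mu in range(dim_in)] for nu in range(dim_out)]

def apply(matrix, x):
    """y^nu = a^nu_mu x^mu."""
    return [sum(row[mu] * x[mu] for mu in range(len(x))) for row in matrix]

A = matrix_of(derivative, 4, 3)
p = [5, -1, 4, 2]                 # 5 - x + 4x^2 + 2x^3
print(A)
print(apply(A, p), derivative(p))
```

The matrix depends on the chosen bases, but the map itself does not.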
The identity map $I : V \to V$ is represented by the $n$-by-$n$ matrix

$$\mathbf{I} = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0\\ 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\end{pmatrix}, \qquad (A.13)$$

which has the same entries in any basis.

Exercise A.2: Let $U$, $V$, $W$ be vector spaces, and $A : V \to W$, $B : U \to V$
linear maps which are represented by the matrices $\mathbf{A}$ with entries $a^\mu{}_\nu$ and $\mathbf{B}$
with entries $b^\mu{}_\nu$, respectively. Use the action of the maps on basis elements
to show that the map $AB : U \to W$ is represented by the matrix product $\mathbf{A}\mathbf{B}$
whose entries are $a^\mu{}_\lambda b^\lambda{}_\nu$.



A. 2. 2 Range-nullspace theorem 

Given a linear map $A : V \to W$, we can define two important subspaces:

i) The kernel or nullspace is defined by

$$\mathrm{Ker}\,A = \{\mathbf{x} \in V : A(\mathbf{x}) = \mathbf{0}\}. \tag{A.14}$$

It is a subspace of V.

ii) The range or image space is defined by

$$\mathrm{Im}\,A = \{\mathbf{y} \in W : \mathbf{y} = A(\mathbf{x}),\ \mathbf{x} \in V\}. \tag{A.15}$$

It is a subspace of the target space W.

The key result linking these spaces is the range-nullspace theorem which
states that

$$\dim(\mathrm{Ker}\,A) + \dim(\mathrm{Im}\,A) = \dim V.$$

It is proved by taking a basis, $\mathbf{n}_\mu$, for Ker A and extending it to a basis for the
whole of V by appending $(\dim V - \dim(\mathrm{Ker}\,A))$ extra vectors, $\mathbf{e}_\nu$. It is easy
to see that the vectors $A(\mathbf{e}_\nu)$ are linearly independent and span $\mathrm{Im}\,A \subseteq W$.
Note that this result is meaningless unless V is finite dimensional.

The number dim (Im A) is the number of linearly independent columns 
in the matrix, and is often called the (column) rank of the matrix. 
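
As a rough check of the range-nullspace theorem, the following sketch row-reduces a small matrix exactly and counts pivots (giving dim Im A) and free columns (giving dim Ker A). The matrix and the helper names `row_reduce` and `kernel_basis` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction


def row_reduce(matrix):
    """Exact Gauss-Jordan elimination; returns (reduced rows, pivot columns)."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        # Look for a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [entry / rows[r][c] for entry in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots


def kernel_basis(matrix):
    """One kernel vector per free (non-pivot) column."""
    reduced, pivots = row_reduce(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vec[p] = -reduced[row_index][free]
        basis.append(vec)
    return basis


# Illustrative map from a 4-dimensional V to a 3-dimensional W.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]

_, pivots = row_reduce(A)
dim_im = len(pivots)
dim_ker = len(kernel_basis(A))
print(dim_ker, dim_im, len(A[0]))
```

Here the second row is twice the first, so only two columns are independent, and the two free columns supply the two kernel vectors needed to make up dim V = 4.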
A.2.3 The dual space

Associated with the vector space V is its dual space, V*, which is the set of
linear maps $f : V \to \mathbb{F}$. In other words the set of linear functions $f(\ )$ that
take in a vector and return a number. These functions are often also called
covectors. (Mathematicians place the prefix co- in front of the name of a
mathematical object to indicate a dual class of objects, consisting of the set
of structure-preserving maps of the original objects into the field over which
they are defined.)

Using linearity we have

$$f(\mathbf{x}) = f(x^\mu \mathbf{e}_\mu) = x^\mu f(\mathbf{e}_\mu) = x^\mu f_\mu. \tag{A.16}$$

The set of numbers $f_\mu = f(\mathbf{e}_\mu)$ are the components of the covector $f \in V^*$.
If we change basis $\mathbf{e}_\nu = a^\mu{}_\nu \mathbf{e}'_\mu$ then

$$f_\nu = f(\mathbf{e}_\nu) = f(a^\mu{}_\nu \mathbf{e}'_\mu) = a^\mu{}_\nu f(\mathbf{e}'_\mu) = a^\mu{}_\nu f'_\mu. \tag{A.17}$$

Thus $f_\nu = a^\mu{}_\nu f'_\mu$ and the components transform in the same manner as the
basis. They are therefore said to transform covariantly.

Given a basis $\mathbf{e}_\mu$ of V, we can define a dual basis for V* as the set of
covectors $\mathbf{e}^{*\mu} \in V^*$ such that

$$\mathbf{e}^{*\mu}(\mathbf{e}_\nu) = \delta^\mu_\nu. \tag{A.18}$$

It should be clear that this is a basis for V*, and that $f$ can be expanded

$$f = f_\mu \mathbf{e}^{*\mu}. \tag{A.19}$$

Although the spaces V and V* have the same dimension, and are therefore
isomorphic, there is no natural map between them. The assignment $\mathbf{e}_\mu \mapsto \mathbf{e}^{*\mu}$
is unnatural because it depends on the choice of basis.

One way of driving home the distinction between V and V* is to consider 
the space V of fruit orders at a grocers. Assume that the grocer stocks only 
apples, oranges and pears. The elements of V are then vectors such as 

$\mathbf{x}$ = 3kg apples + 4.5kg oranges + 2kg pears. (A.20)

Take V* to be the space of possible price lists, an example element being 



$f$ = (£3.00/kg) apples* + (£2.00/kg) oranges* + (£1.50/kg) pears*. (A.21)

The evaluation of $f$ on $\mathbf{x}$

$f(\mathbf{x})$ = 3 × £3.00 + 4.5 × £2.00 + 2 × £1.50 = £21.00, (A.22)

then returns the total cost of the order. You should have no difficulty in 
distinguishing between a price list and box of fruit! 
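
The fruit-order example can be mirrored in a few lines of code. This is only a sketch: the dictionary representation and the function name `pair` are illustrative choices, with the order playing the role of the vector and the price list that of the covector.

```python
from fractions import Fraction

# An order is a vector in V: kilograms of each fruit.
order = {"apples": Fraction(3), "oranges": Fraction(9, 2), "pears": Fraction(2)}

# A price list is a covector in V*: pounds per kilogram of each fruit.
prices = {"apples": Fraction(3), "oranges": Fraction(2), "pears": Fraction(3, 2)}


def pair(covector, vector):
    """Evaluate f(x) = f_mu x^mu by summing over the common basis labels."""
    return sum(covector[fruit] * amount for fruit, amount in vector.items())


total = pair(prices, order)
print(f"total cost: £{float(total):.2f}")

# Linearity in the vector slot: doubling the order doubles the cost.
double_order = {fruit: 2 * amount for fruit, amount in order.items()}
assert pair(prices, double_order) == 2 * total
```

Nothing in `pair` lets us add a price list to an order, which is the point of keeping V and V* distinct.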

We may consider the original vector space V to be the dual space of V*
since, given vectors $\mathbf{x} \in V$ and $f \in V^*$, we naturally define $\mathbf{x}(f)$ to be
$f(\mathbf{x})$. Thus $(V^*)^* = V$. Instead of giving one space priority as being the set
of linear functions on the other, we can treat V and V* on an equal footing.
We then speak of the pairing of $\mathbf{x} \in V$ with $f \in V^*$ to get a number in the
field. It is then common to use the notation $(f, \mathbf{x})$ to mean either of $f(\mathbf{x})$ or
$\mathbf{x}(f)$. Warning: despite the similarity of the notation, do not fall into the
trap of thinking of the pairing $(f, \mathbf{x})$ as an inner product (see next section) of
$f$ with $\mathbf{x}$. The two objects being paired live in different spaces. In an inner
product, the vectors being multiplied live in the same space.

A.3 Inner-Product Spaces

Some vector spaces V come equipped with an inner (or scalar) product. This
additional structure allows us to relate V and V*.

A.3.1 Inner products

We will use the symbol $(\mathbf{x}, \mathbf{y})$ to denote an inner product. An inner (or
scalar) product is a conjugate-symmetric, sesquilinear, non-degenerate map
$V \times V \to \mathbb{F}$. In this string of jargon, the phrase conjugate symmetric means
that

$$(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})^*, \tag{A.23}$$

where the "*" denotes complex conjugation, and sesquilinear (see footnote 3) means

$$(\mathbf{x}, \lambda\mathbf{y} + \mu\mathbf{z}) = \lambda(\mathbf{x}, \mathbf{y}) + \mu(\mathbf{x}, \mathbf{z}), \tag{A.24}$$
$$(\lambda\mathbf{x} + \mu\mathbf{y}, \mathbf{z}) = \lambda^*(\mathbf{x}, \mathbf{z}) + \mu^*(\mathbf{y}, \mathbf{z}). \tag{A.25}$$

The product is therefore linear in the second slot, but anti-linear in the
first. When our field is the real numbers $\mathbb{R}$ then the complex conjugation is
redundant and the product will be symmetric

$$(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x}), \tag{A.26}$$

and bilinear

$$(\mathbf{x}, \lambda\mathbf{y} + \mu\mathbf{z}) = \lambda(\mathbf{x}, \mathbf{y}) + \mu(\mathbf{x}, \mathbf{z}), \tag{A.27}$$
$$(\lambda\mathbf{x} + \mu\mathbf{y}, \mathbf{z}) = \lambda(\mathbf{x}, \mathbf{z}) + \mu(\mathbf{y}, \mathbf{z}). \tag{A.28}$$

3 Sesqui is a Latin prefix meaning "one-and-a-half".

The term non-degenerate means that $(\mathbf{x}, \mathbf{y}) = 0$ for all $\mathbf{y}$ implies that
$\mathbf{x} = \mathbf{0}$. Many inner products satisfy the stronger condition of being positive
definite. This means that $(\mathbf{x}, \mathbf{x}) > 0$, unless $\mathbf{x} = \mathbf{0}$, when $(\mathbf{x}, \mathbf{x}) = 0$. Positive
definiteness implies non-degeneracy, but not vice-versa.

Given a basis $\mathbf{e}_\mu$, we can form the pairwise products

$$(\mathbf{e}_\mu, \mathbf{e}_\nu) = g_{\mu\nu}. \tag{A.29}$$

If the array of numbers $g_{\mu\nu}$ constituting the components of the metric tensor
turns out to be $g_{\mu\nu} = \delta_{\mu\nu}$, then we say that the basis is orthonormal with
respect to the inner product. We will not assume orthonormality without
specifically saying so. The non-degeneracy of the inner product guarantees
the existence of a matrix $g^{\mu\nu}$ which is the inverse of $g_{\mu\nu}$, i.e. $g_{\mu\nu} g^{\nu\lambda} = \delta_\mu^\lambda$.

If we take our field to be the real numbers $\mathbb{R}$ then the additional structure
provided by a non-degenerate inner product allows us to identify V with V*.
For any $f \in V^*$ we can find a vector $\mathbf{f} \in V$ such that

$$f(\mathbf{x}) = (\mathbf{f}, \mathbf{x}). \tag{A.30}$$

In components, we solve the equation

$$f_\mu = g_{\mu\nu} f^\nu \tag{A.31}$$

for $f^\nu$. We find $f^\nu = g^{\nu\mu} f_\mu$. Usually, we simply identify $f$ with $\mathbf{f}$, and hence
V with V*. We say that the covariant components $f_\mu$ are related to the
contravariant components $f^\mu$ by raising

$$f^\mu = g^{\mu\nu} f_\nu, \tag{A.32}$$

or lowering

$$f_\mu = g_{\mu\nu} f^\nu, \tag{A.33}$$

the indices using the metric tensor. Obviously, this identification depends
crucially on the inner product; a different inner product would, in general,
identify an $f \in V^*$ with a completely different $\mathbf{f} \in V$.
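
A small numerical sketch of raising and lowering may help. It assumes a real two-dimensional space whose (hypothetical) basis vectors are e_1 = (1, 0) and e_2 = (1, 1) in Cartesian coordinates, so that the metric is g = [[1, 1], [1, 2]]; the helper names are illustrative.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix, used here to obtain g^{mu nu} from g_{mu nu}."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def contract(matrix, components):
    """Sum over the adjacent index: result_mu = matrix_{mu nu} components^nu."""
    return [sum(row[k] * components[k] for k in range(len(components)))
            for row in matrix]


# Metric g_{mu nu} = e_mu . e_nu for the assumed non-orthonormal basis.
g_lower = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(2)]]
g_upper = inverse_2x2(g_lower)

f_lower = [Fraction(3), Fraction(5)]           # covariant components f_mu
f_upper = contract(g_upper, f_lower)           # raise: f^mu = g^{mu nu} f_nu
f_lower_again = contract(g_lower, f_upper)     # lower: f_mu = g_{mu nu} f^nu

# The pairing f(x) = f_mu x^mu agrees with the inner product (f, x).
x_upper = [Fraction(2), Fraction(-1)]
pairing = sum(f * x for f, x in zip(f_lower, x_upper))
inner = sum(f * gx for f, gx in zip(f_upper, contract(g_lower, x_upper)))
print(f_upper, pairing, inner)
```

Changing `g_lower` changes `f_upper` for the same `f_lower`, which is the dependence on the inner product noted above.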
A.3.2 Euclidean vectors

Consider $\mathbb{R}^n$ equipped with its Euclidean metric and associated "dot" inner
product. Given a vector $\mathbf{x}$ and a basis $\mathbf{e}_\mu$ with $g_{\mu\nu} = \mathbf{e}_\mu \cdot \mathbf{e}_\nu$, we can define two
sets of components for the same vector. Firstly the coefficients $x^\mu$ appearing
in the basis expansion

$$\mathbf{x} = x^\mu \mathbf{e}_\mu,$$

and secondly the "components"

$$x_\mu = \mathbf{x} \cdot \mathbf{e}_\mu = g_{\mu\nu} x^\nu,$$

of $\mathbf{x}$ along the basis vectors. The $x_\mu$ are obtained from the $x^\mu$ by the same
"lowering" operation as before, and so $x^\mu$ and $x_\mu$ are naturally referred to
as the contravariant and covariant components, respectively, of the vector $\mathbf{x}$.
When the $\mathbf{e}_\mu$ constitute an orthonormal basis, then $g_{\mu\nu} = \delta_{\mu\nu}$ and the two
sets of components are numerically coincident.



A.3.3 Bra and ket vectors

When our vector space is over the field of the complex numbers, the
anti-linearity of the first slot of the inner product means we can no longer make a
simple identification of V with V*. Instead there is an anti-linear
correspondence between the two spaces. The vector $\mathbf{x} \in V$ is mapped to $(\mathbf{x}, \ )$ which,
since it returns a number when a vector is inserted into its vacant slot, is an
element of V*. This mapping is anti-linear because

$$\lambda\mathbf{x} + \mu\mathbf{y} \mapsto (\lambda\mathbf{x} + \mu\mathbf{y}, \ ) = \lambda^*(\mathbf{x}, \ ) + \mu^*(\mathbf{y}, \ ). \tag{A.34}$$

This antilinear map is probably familiar to you from quantum mechanics,
where V is the space of Dirac's "ket" vectors $|\psi\rangle$ and V* the space of
"bra" vectors $\langle\psi|$. The symbol, here $\psi$, in each of these objects is a label
distinguishing one state-vector from another. We often use the eigenvalues
of some complete set of commuting operators as labels. To each vector $|\psi\rangle$ we use
the $(\ldots)^\dagger$ map to assign it a dual vector

$$|\psi\rangle \mapsto |\psi\rangle^\dagger \equiv \langle\psi|$$

having the same labels. The dagger map is defined to be antilinear

$$(\lambda|\psi\rangle + \mu|\chi\rangle)^\dagger = \lambda^*\langle\psi| + \mu^*\langle\chi|, \tag{A.35}$$






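and Dirac denoted the number resulting from the pairing of the covector $\langle\psi|$
with the vector $|\chi\rangle$ by the "bra-c-ket" symbol $\langle\psi|\chi\rangle$:

$$\langle\psi|\chi\rangle \equiv \bigl(\langle\psi|, |\chi\rangle\bigr). \tag{A.36}$$

We can regard the dagger map as either determining the inner-product on V
via

$$\bigl(|\psi\rangle, |\chi\rangle\bigr) \equiv \bigl(|\psi\rangle^\dagger, |\chi\rangle\bigr) = \bigl(\langle\psi|, |\chi\rangle\bigr) \equiv \langle\psi|\chi\rangle, \tag{A.37}$$

or being determined by it as

$$|\psi\rangle^\dagger \equiv \bigl(|\psi\rangle, \ \bigr) \equiv \langle\psi|. \tag{A.38}$$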



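When we represent our vectors by their components with respect to an
orthonormal basis, the dagger map is the familiar operation of taking the
conjugate transpose,

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \mapsto \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}^\dagger = (x_1^*, x_2^*, \ldots, x_n^*), \tag{A.39}$$

but this is not true in general. In a non-orthogonal basis the column vector
with components $x^\mu$ is mapped to the row vector with components $(x^\dagger)_\mu = (x^\nu)^* g_{\nu\mu}$.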

Much of Dirac notation tacitly assumes an orthonormal basis. For example,
in the expansion

$$|\psi\rangle = \sum_n |n\rangle\langle n|\psi\rangle \tag{A.40}$$

the expansion coefficients $\langle n|\psi\rangle$ should be the contravariant components of
$|\psi\rangle$, but the $\langle n|\psi\rangle$ have been obtained from the inner product, and so are in
fact its covariant components. The expansion (A.40) is therefore valid only
when the $|n\rangle$ constitute an orthonormal basis. This will always be the case
when the labels on the states show them to be the eigenvectors of a complete
commuting set of observables, but sometimes, for example, we may use the
integer "n" to refer to an orbital centered on a particular atom in a crystal,
and then $\langle n|m\rangle \ne \delta_{mn}$. When using such a non-orthonormal basis it is safer
not to use Dirac notation.
Conjugate operator

A linear map $A : V \to W$ automatically induces a map $A^* : W^* \to V^*$.
Given $f \in W^*$ we can evaluate $f(A(\mathbf{x}))$ for any $\mathbf{x}$ in V, and so $f(A(\ ))$ is an
element of V* that we may denote by $A^*(f)$. Thus,

$$A^*(f)(\mathbf{x}) = f(A(\mathbf{x})). \tag{A.41}$$

Functional analysts call $A^*$ the conjugate of $A$. The word "conjugate" and
the symbol $A^*$ are rather unfortunate as they have the potential for generating
confusion (see footnote 4), not least because the $(\ldots)^*$ map is linear. No complex
conjugation is involved. Thus

$$(\lambda A + \mu B)^* = \lambda A^* + \mu B^*. \tag{A.42}$$

Dirac deftly sidesteps this notational problem by writing $\langle\psi|A$ for the
action of the conjugate of the operator $A : V \to V$ on the bra vector $\langle\psi| \in V^*$.
After setting $f \to \langle\psi|$ and $\mathbf{x} \to |\chi\rangle$, equation (A.41) therefore reads

$$\bigl(\langle\psi|A\bigr)|\chi\rangle = \langle\psi|\bigl(A|\chi\rangle\bigr). \tag{A.43}$$

This shows that it does not matter where we place the parentheses, so Dirac
simply drops them and uses one symbol $\langle\psi|A|\chi\rangle$ to represent both sides
of (A.43). Dirac notation thus avoids the non-complex-conjugating "*" by
suppressing the distinction between an operator and its conjugate. If,
therefore, for some reason we need to make the distinction, we cannot use Dirac
notation.

Exercise A.3: If $A : V \to V$ and $B : V \to V$ show that $(AB)^* = B^* A^*$.

Exercise A.4: How does the reversal of the operator order in the previous
exercise manifest itself in Dirac notation?

A.3.4 Adjoint operator

The "conjugate" operator of the previous section does not require an inner 
product for its definition, and is a map from V* to V*. When we do have an 
inner product, however, we can use it to define a different operator
"conjugate" to $A$ that, like $A$ itself, is a map from V to V. This new conjugate is
called the adjoint or the Hermitian conjugate of $A$. To construct it, we first
remind ourselves that for any linear map $f : V \to \mathbb{C}$, there is a vector $\mathbf{f} \in V$
such that $f(\mathbf{x}) = (\mathbf{f}, \mathbf{x})$. (To find it we simply solve $f_\nu = (f^\mu)^* g_{\mu\nu}$ for $f^\mu$.)
We next observe that $\mathbf{x} \mapsto (\mathbf{y}, A\mathbf{x})$ is such a linear map, and so there is a $\mathbf{z}$
such that $(\mathbf{y}, A\mathbf{x}) = (\mathbf{z}, \mathbf{x})$. It should be clear that $\mathbf{z}$ depends linearly on $\mathbf{y}$,
so we may define the adjoint linear map, $A^\dagger$, by setting $A^\dagger\mathbf{y} = \mathbf{z}$. This gives
us the identity

$$(\mathbf{y}, A\mathbf{x}) = (A^\dagger\mathbf{y}, \mathbf{x}).$$

4 The terms dual, transpose, or adjoint are sometimes used in place of "conjugate."
Each of these words brings its own capacity for confusion.

The correspondence $A \mapsto A^\dagger$ is anti-linear

$$(\lambda A + \mu B)^\dagger = \lambda^* A^\dagger + \mu^* B^\dagger. \tag{A.44}$$

The adjoint of $A$ depends on the inner product being used to define it.
Different inner products give different $A^\dagger$'s.

In the particular case that our chosen basis $\mathbf{e}_\mu$ is orthonormal with respect
to the inner product, i.e.

$$(\mathbf{e}_\mu, \mathbf{e}_\nu) = \delta_{\mu\nu}, \tag{A.45}$$

then the Hermitian conjugate $A^\dagger$ of the operator $A$ is represented by the
Hermitian conjugate matrix $\mathbf{A}^\dagger$ which is obtained from the matrix $\mathbf{A}$ by
interchanging rows and columns and complex conjugating the entries.

Exercise A.5: Show that $(AB)^\dagger = B^\dagger A^\dagger$.

Exercise A.6: When the basis is not orthonormal, show that

$$(A^\dagger)^\rho{}_\sigma = \left(g_{\sigma\mu} A^\mu{}_\nu g^{\nu\rho}\right)^*. \tag{A.46}$$
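
The following sketch checks the component formula of Exercise A.6 numerically for one hypothetical example; it is a sanity check, not a proof. The metric `g`, the matrix `A`, and the vectors are arbitrary illustrative choices, and the inner product is taken as (x, y) = (x^mu)* g_{mu nu} y^nu, anti-linear in the first slot as above.

```python
def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix (gives g^{mu nu} from g_{mu nu})."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(matrix, v):
    """Components of the image vector: (A v)^mu = A^mu_nu v^nu."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in matrix]


def inner(x, y, g):
    """Inner product (x, y) = (x^mu)* g_{mu nu} y^nu."""
    n = len(x)
    return sum(x[m].conjugate() * g[m][k] * y[k]
               for m in range(n) for k in range(n))


def adjoint(a, g):
    """(A^dagger)^rho_sigma = (g_{sigma mu} A^mu_nu g^{nu rho})*, as in (A.46)."""
    g_inv = inverse_2x2(g)
    n = len(a)
    return [[sum(g[s][m] * a[m][k] * g_inv[k][r]
                 for m in range(n) for k in range(n)).conjugate()
             for s in range(n)]
            for r in range(n)]


# Hermitian, positive-definite metric of an assumed non-orthonormal basis.
g = [[2 + 0j, 1j], [-1j, 1 + 0j]]
A = [[1 + 2j, 3 + 0j], [0.5j, -1 + 0j]]
x = [1 + 1j, 2 - 1j]
y = [0.5 + 0j, -1j]

A_dag = adjoint(A, g)
lhs = inner(y, apply(A, x), g)        # (y, A x)
rhs = inner(apply(A_dag, y), x, g)    # (A^dagger y, x)
print(abs(lhs - rhs) < 1e-12)
```

For this metric `A_dag` is not simply the conjugate transpose of `A`; the two coincide only when `g` is the identity.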

A.4 Sums and Differences of Vector Spaces

A.4.1 Direct sums

Suppose that U and V are vector spaces. We define their direct sum $U \oplus V$
to be the vector space of ordered pairs $(\mathbf{u}, \mathbf{v})$ with

$$\lambda(\mathbf{u}_1, \mathbf{v}_1) + \mu(\mathbf{u}_2, \mathbf{v}_2) = (\lambda\mathbf{u}_1 + \mu\mathbf{u}_2, \lambda\mathbf{v}_1 + \mu\mathbf{v}_2). \tag{A.47}$$

The set of vectors $\{(\mathbf{u}, \mathbf{0})\} \subset U \oplus V$ forms a copy of U, and $\{(\mathbf{0}, \mathbf{v})\} \subset U \oplus V$
a copy of V. Thus U and V may be regarded as subspaces of $U \oplus V$.

If U and V are any pair of subspaces of W, we can form the space $U + V$
consisting of all elements of W that can be written as $\mathbf{u} + \mathbf{v}$ with $\mathbf{u} \in U$ and
$\mathbf{v} \in V$. The decomposition $\mathbf{x} = \mathbf{u} + \mathbf{v}$ of an element $\mathbf{x} \in U + V$ into parts in
U and V will be unique (in that $\mathbf{u}_1 + \mathbf{v}_1 = \mathbf{u}_2 + \mathbf{v}_2$ implies that $\mathbf{u}_1 = \mathbf{u}_2$ and
$\mathbf{v}_1 = \mathbf{v}_2$) if and only if $U \cap V = \{\mathbf{0}\}$ where $\{\mathbf{0}\}$ is the subspace containing
only the zero vector. In this case $U + V$ can be identified with $U \oplus V$.

If U is a subspace of W then we can seek a complementary space V such
that $W = U \oplus V$, or, equivalently, $W = U + V$ with $U \cap V = \{\mathbf{0}\}$. Such
complementary spaces are not unique. Consider $\mathbb{R}^3$, for example, with U
being the vectors in the x, y plane. If $\mathbf{e}$ is any vector that does not lie in
this plane then the one-dimensional space spanned by $\mathbf{e}$ is a complementary
space for U.

A.4.2 Quotient spaces

We have seen that if U is a subspace of W there are many complementary
subspaces V such that $W = U \oplus V$. We may however define a unique space
that we could write as $W - U$ and call it the difference of the two spaces.
It is more common, however, to see this space written as $W/U$ and referred
to as the quotient of W modulo U. This quotient space is the vector space
of equivalence classes of vectors, where we do not distinguish between two
vectors in W if their difference lies in U. In other words

$$\mathbf{x} \equiv \mathbf{y} \pmod{U} \iff \mathbf{x} - \mathbf{y} \in U. \tag{A.48}$$

The collection of elements in W that are equivalent to $\mathbf{x}$ (mod U) composes
a coset, written $\mathbf{x} + U$, a set whose elements are $\mathbf{x} + \mathbf{u}$ where $\mathbf{u}$ is any vector
in U. These cosets are the elements of $W/U$.

If we have a positive-definite inner product we can also define a unique
orthogonal complement of $U \subset W$. We define $U^\perp$ to be the set

$$U^\perp = \{\mathbf{x} \in W : (\mathbf{x}, \mathbf{y}) = 0,\ \forall \mathbf{y} \in U\}. \tag{A.49}$$

It is easy to see that this is a linear subspace and that $U \oplus U^\perp = W$. For
finite dimensional spaces

$$\dim W/U = \dim U^\perp = \dim W - \dim U$$

and $(U^\perp)^\perp = U$. For infinite dimensional spaces we only have $(U^\perp)^\perp \supseteq U$.
(Be careful, however. If the inner product is not positive definite, U and $U^\perp$
may have non-zero vectors in common.)

Although they have the same dimensions, do not confuse $W/U$ with $U^\perp$,
and in particular do not use the phrase orthogonal complement without
specifying an inner product.

A practical example of a quotient space occurs in digital imaging. A
colour camera reduces the infinite-dimensional space $\mathcal{L}$ of coloured light
incident on each pixel to three numbers, R, G and B, these obtained by pairing
the spectral intensity with the frequency response (an element of $\mathcal{L}^*$) of the
red, green and blue detectors at that point. The space of distinguishable
colours is therefore only three dimensional. Many different incident spectra
will give the same output RGB signal, and are therefore equivalent as far
as the camera is concerned. In the colour industry these equivalent colours
are called metamers. Equivalent colours differ by spectral intensities that lie
in the space $\mathcal{B}$ of metameric black. There is no inner product here, so it is
meaningless to think of the space of distinguishable colours as being $\mathcal{B}^\perp$. It
is, however, precisely what we mean by $\mathcal{L}/\mathcal{B}$.

When we have a linear map $A : U \to V$, the quotient space $V/\mathrm{Im}\,A$ is
often called the co-kernel of $A$.

A.4.3 Projection-operator decompositions

An operator $P : V \to V$ that obeys $P^2 = P$ is called a projection operator.
It projects a vector $\mathbf{x} \in V$ to $P\mathbf{x} \in \mathrm{Im}\,P$ along $\mathrm{Ker}\,P$, in the sense of
casting a shadow onto Im P with the light coming from the direction Ker P.
In other words all vectors lying in Ker P are killed, whilst any vector already
in Im P is left alone by P. (If $\mathbf{x} \in \mathrm{Im}\,P$ then $\mathbf{x} = P\mathbf{y}$ for some $\mathbf{y} \in V$, and
$P\mathbf{x} = P^2\mathbf{y} = P\mathbf{y} = \mathbf{x}$.) The only vector common to both Ker P and Im P is
$\mathbf{0}$, and so

$$V = \mathrm{Ker}\,P \oplus \mathrm{Im}\,P. \tag{A.50}$$

Exercise A.7: Let $P_1$ be a projection operator. Show that $P_2 = I - P_1$ is
also a projection operator and $P_1 P_2 = 0$. Show also that $\mathrm{Im}\,P_2 = \mathrm{Ker}\,P_1$ and
$\mathrm{Ker}\,P_2 = \mathrm{Im}\,P_1$.
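
A concrete, purely illustrative instance of the decomposition (A.50): the matrix below (an assumption chosen for this sketch) projects the plane onto the x-axis along the direction (1, 1). It is an oblique projection, so no inner product is needed.

```python
def matmul(a, b):
    """Matrix product, summing over the adjacent indices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(matrix, v):
    """Action of a matrix on a column of components."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in matrix]


P = [[1, -1],
     [0, 0]]        # Im P is spanned by (1, 0), Ker P by (1, 1)
identity = [[1, 0], [0, 1]]
Q = [[identity[i][j] - P[i][j] for j in range(2)] for i in range(2)]   # I - P

x = [3, 2]
part_in_image = apply(P, x)     # the "shadow" of x on Im P
part_in_kernel = apply(Q, x)    # what is left over lies in Ker P

print(matmul(P, P) == P)                        # P is idempotent
print(part_in_image, part_in_kernel)
print(apply(P, part_in_kernel) == [0, 0])       # the remainder is killed by P
```

The two parts add back up to `x`, and each lies in a different summand of Ker P ⊕ Im P.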

A.5 Inhomogeneous Linear Equations

Suppose we wish to solve the system of linear equations

$$\begin{aligned} a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n &= b_1, \\ a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n &= b_2, \\ &\ \,\vdots \\ a_{m1}y_1 + a_{m2}y_2 + \cdots + a_{mn}y_n &= b_m, \end{aligned}$$

or, in matrix notation,

$$\mathbf{A}\mathbf{y} = \mathbf{b}, \tag{A.51}$$

where $\mathbf{A}$ is the m-by-n matrix with entries $a_{ij}$. Faced with such a problem,
we should start by asking ourselves the questions: 

i) Does a solution exist? 

ii) If a solution does exist, is it unique? 

These issues are best addressed by considering the matrix $\mathbf{A}$ as a linear
operator $A : V \to W$, where V is n dimensional and W is m dimensional.
The natural language is then that of the range and nullspaces of $A$. There
is no solution to the equation $A\mathbf{y} = \mathbf{b}$ when Im A is not the whole of W
and $\mathbf{b}$ does not lie in Im A. Similarly, the solution will not be unique if
there are distinct vectors $\mathbf{x}_1$, $\mathbf{x}_2$ such that $A\mathbf{x}_1 = A\mathbf{x}_2$. This means that
$A(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}$, or $(\mathbf{x}_1 - \mathbf{x}_2) \in \mathrm{Ker}\,A$. These situations are linked, as we
have seen, by the range-nullspace theorem:

$$\dim(\mathrm{Ker}\,A) + \dim(\mathrm{Im}\,A) = \dim V. \tag{A.52}$$

Thus, if m > n there are bound to be some vectors $\mathbf{b}$ for which no solution
exists. When m < n the solution cannot be unique.

A.5.1 Rank and index

Suppose V = W (so m = n and the matrix is square) and we choose an inner
product, $(\mathbf{x}, \mathbf{y})$, on V. Then $\mathbf{x} \in \mathrm{Ker}\,A$ implies that, for all $\mathbf{y}$,

$$0 = (\mathbf{y}, A\mathbf{x}) = (A^\dagger\mathbf{y}, \mathbf{x}), \tag{A.53}$$

or that $\mathbf{x}$ is perpendicular to the range of $A^\dagger$. Conversely, let $\mathbf{x}$ be
perpendicular to the range of $A^\dagger$; then

$$(\mathbf{x}, A^\dagger\mathbf{y}) = 0, \quad \forall \mathbf{y} \in V, \tag{A.54}$$

which means that

$$(A\mathbf{x}, \mathbf{y}) = 0, \quad \forall \mathbf{y} \in V, \tag{A.55}$$

and, by the non-degeneracy of the inner product, this means that $A\mathbf{x} = \mathbf{0}$.
The net result is that

$$\mathrm{Ker}\,A = (\mathrm{Im}\,A^\dagger)^\perp. \tag{A.56}$$

Similarly

$$\mathrm{Ker}\,A^\dagger = (\mathrm{Im}\,A)^\perp. \tag{A.57}$$

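Now

$$\begin{aligned} \dim(\mathrm{Ker}\,A) + \dim(\mathrm{Im}\,A) &= \dim V, \\ \dim(\mathrm{Ker}\,A^\dagger) + \dim(\mathrm{Im}\,A^\dagger) &= \dim V, \end{aligned} \tag{A.58}$$

but

$$\begin{aligned} \dim(\mathrm{Ker}\,A) &= \dim(\mathrm{Im}\,A^\dagger)^\perp \\ &= \dim V - \dim(\mathrm{Im}\,A^\dagger) \\ &= \dim(\mathrm{Ker}\,A^\dagger). \end{aligned}$$

Thus, for finite-dimensional square matrices, we have

$$\dim(\mathrm{Ker}\,A) = \dim(\mathrm{Ker}\,A^\dagger).$$

In particular, the row and column rank of a square matrix coincide.
Example: Consider the matrix

$$\mathbf{A} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \\ 2 & 3 & 4 \end{pmatrix}.$$

Clearly, the number of linearly independent rows is two, since the third row
is the sum of the other two. The number of linearly independent columns is
also two, although less obviously so, because

$$-\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} + 2\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}.$$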

Warning: The equality $\dim(\mathrm{Ker}\,A) = \dim(\mathrm{Ker}\,A^\dagger)$ need not hold in
infinite dimensional spaces. Consider the space with basis $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \ldots$ indexed
by the positive integers. Define $A\mathbf{e}_1 = \mathbf{e}_2$, $A\mathbf{e}_2 = \mathbf{e}_3$, and so on. This
operator has $\dim(\mathrm{Ker}\,A) = 0$. The adjoint with respect to the natural inner
product has $A^\dagger\mathbf{e}_1 = \mathbf{0}$, $A^\dagger\mathbf{e}_2 = \mathbf{e}_1$, $A^\dagger\mathbf{e}_3 = \mathbf{e}_2$. Thus $\mathrm{Ker}\,A^\dagger$ is spanned by $\mathbf{e}_1$, and
$\dim(\mathrm{Ker}\,A^\dagger) = 1$. The difference $\dim(\mathrm{Ker}\,A) - \dim(\mathrm{Ker}\,A^\dagger)$ is called the
index of the operator. The index of an operator is often related to topological
properties of the space on which it acts, and in this way appears in physics
as the origin of anomalies in quantum field theory.



A.5.2 Fredholm alternative

The results of the previous section can be summarized as saying that the
Fredholm Alternative holds for finite square matrices. The Fredholm
Alternative is the set of statements

I. Either

i) $A\mathbf{x} = \mathbf{b}$ has a unique solution,

or

ii) $A\mathbf{x} = \mathbf{0}$ has a non-zero solution.

II. If $A\mathbf{x} = \mathbf{0}$ has n linearly independent solutions, then so does $A^\dagger\mathbf{x} = \mathbf{0}$.

III. If alternative ii) holds, then $A\mathbf{x} = \mathbf{b}$ has no solution unless $\mathbf{b}$ is
orthogonal to all solutions of $A^\dagger\mathbf{x} = \mathbf{0}$.

It should be obvious that this is a recasting of the statements that

$$\dim(\mathrm{Ker}\,A) = \dim(\mathrm{Ker}\,A^\dagger),$$

and

$$(\mathrm{Ker}\,A^\dagger)^\perp = \mathrm{Im}\,A. \tag{A.59}$$

Notice that finite-dimensionality is essential here. Neither of these statements
is guaranteed to be true in infinite dimensional spaces.
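
Statement III can be illustrated with a small singular system. In this sketch the matrix, the right-hand sides, and the helper `rank` are illustrative assumptions; the matrix is real and the inner product is the ordinary dot product, so the adjoint is just the transpose. Solvability is tested by comparing the rank of the matrix with that of the augmented matrix.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by exact forward elimination."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def solvable(matrix, b):
    """A x = b has a solution iff appending b as a column does not raise the rank."""
    augmented = [row + [b_i] for row, b_i in zip(matrix, b)]
    return rank(augmented) == rank(matrix)


A = [[1, 2],
     [2, 4]]                 # singular, so alternative ii) holds
A_dagger = [list(col) for col in zip(*A)]
null_of_adjoint = [2, -1]    # A_dagger applied to this vector gives zero

b_good = [1, 2]              # orthogonal to null_of_adjoint
b_bad = [1, 0]               # not orthogonal to null_of_adjoint

for b in (b_good, b_bad):
    overlap = sum(n * b_i for n, b_i in zip(null_of_adjoint, b))
    print(b, "overlap:", overlap, "solvable:", solvable(A, b))
```

Only the right-hand side with zero overlap is solvable, as statement III requires.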



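A.6 Determinants

A.6.1 Skew-symmetric n-linear Forms

You will be familiar with the elementary definition of the determinant of an
n-by-n matrix $\mathbf{A}$ having entries $a_{ij}$:

$$\det\mathbf{A} \equiv \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \stackrel{\rm def}{=} \epsilon_{i_1 i_2 \ldots i_n} a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}. \tag{A.60}$$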
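Here, $\epsilon_{i_1 i_2 \ldots i_n}$ is the Levi-Civita symbol, which is skew-symmetric in all its
indices and $\epsilon_{12\ldots n} = 1$. From this definition we see that the determinant
changes sign if any pair of its rows are interchanged, and that it is linear in
each row. In other words

$$\begin{vmatrix} \lambda a_{11} + \mu b_{11} & \lambda a_{12} + \mu b_{12} & \cdots & \lambda a_{1n} + \mu b_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix} = \lambda \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix} + \mu \begin{vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix}.$$

If we consider each row as being the components of a vector in an n-dimensional
vector space V, we may regard the determinant as being a skew-symmetric
n-linear form, i.e. a map

$$\omega : \underbrace{V \times V \times \cdots \times V}_{n\ \text{factors}} \to \mathbb{F} \tag{A.61}$$

which is linear in each slot,

$$\omega(\lambda\mathbf{a} + \mu\mathbf{b}, \mathbf{c}_2, \ldots, \mathbf{c}_n) = \lambda\,\omega(\mathbf{a}, \mathbf{c}_2, \ldots, \mathbf{c}_n) + \mu\,\omega(\mathbf{b}, \mathbf{c}_2, \ldots, \mathbf{c}_n), \tag{A.62}$$

and changes sign when any two arguments are interchanged,

$$\omega(\ldots, \mathbf{a}_i, \ldots, \mathbf{a}_j, \ldots) = -\omega(\ldots, \mathbf{a}_j, \ldots, \mathbf{a}_i, \ldots). \tag{A.63}$$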
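
The elementary definition (A.60) translates directly into a sum over permutations, with the Levi-Civita symbol supplying the sign of each permutation. The sketch below is illustrative only (it costs n! operations, so it is not a practical algorithm); the function names and the test matrix are assumptions.

```python
from itertools import permutations


def permutation_sign(perm):
    """Levi-Civita symbol for a permutation: (-1)^(number of inversions)."""
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(matrix):
    """det A = epsilon_{i1...in} a_{1 i1} a_{2 i2} ... a_{n in}."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]

swapped = [A[1], A[0], A[2]]     # interchange the first two rows
print(det(A), det(swapped))

# Linearity in the first row, with the other rows held fixed.
a, b, lam, mu = [2, 0, 1], [5, -1, 3], 3, -2
combo = [lam * x + mu * y for x, y in zip(a, b)]
lhs = det([combo, A[1], A[2]])
rhs = lam * det([a, A[1], A[2]]) + mu * det([b, A[1], A[2]])
print(lhs == rhs)
```

Swapping two rows flips the sign, and the determinant is linear in any single row.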



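We will denote the space of skew-symmetric n-linear forms on V by the
symbol $\bigwedge^n(V^*)$. Let $\omega$ be an arbitrary skew-symmetric n-linear form in
$\bigwedge^n(V^*)$, and let $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ be a basis for V. If $\mathbf{a}_i = a_{ij}\mathbf{e}_j$ $(i = 1, \ldots, n)$
is a collection of n vectors (see footnote 5), we compute

$$\begin{aligned} \omega(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n) &= a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}\, \omega(\mathbf{e}_{i_1}, \mathbf{e}_{i_2}, \ldots, \mathbf{e}_{i_n}) \\ &= a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}\, \epsilon_{i_1 i_2 \ldots i_n}\, \omega(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n). \end{aligned} \tag{A.64}$$

5 The index j on $a_{ij}$ should really be a superscript since $a_{ij}$ is the j-th contravariant
component of the vector $\mathbf{a}_i$. We are writing it as a subscript only for compatibility with
other equations in this section.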
In the first line we have exploited the linearity of $\omega$ in each slot, and in going
from the first to the second line we have used skew-symmetry to rearrange
the basis vectors in their canonical order. We deduce that all skew-symmetric
n-forms are proportional to the determinant

$$\omega(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n) \propto \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix},$$

and that the proportionality factor is the number $\omega(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$. When
the number of its slots is equal to the dimension of the vector space, there is
therefore essentially only one skew-symmetric multilinear form and $\bigwedge^n(V^*)$
is a one-dimensional vector space.

Now we use the notion of skew-symmetric n-linear forms to give a powerful
definition of the determinant of an endomorphism, i.e. a linear map
$A : V \to V$. Let $\omega$ be a non-zero skew-symmetric n-linear form. The object

$$\omega_A(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) \equiv \omega(A\mathbf{x}_1, A\mathbf{x}_2, \ldots, A\mathbf{x}_n) \tag{A.65}$$

is also a skew-symmetric n-linear form. Since there is only one such object
up to multiplicative constants, we must have

$$\omega(A\mathbf{x}_1, A\mathbf{x}_2, \ldots, A\mathbf{x}_n) \propto \omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n). \tag{A.66}$$

We define "det A" to be the constant of proportionality. Thus

$$\omega(A\mathbf{x}_1, A\mathbf{x}_2, \ldots, A\mathbf{x}_n) = \det(A)\,\omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n). \tag{A.67}$$

By writing this out in a basis where the linear map $A$ is represented by the
matrix $\mathbf{A}$, we easily see that

$$\det A = \det\mathbf{A}. \tag{A.68}$$

The new definition is therefore compatible with the old one. The advantage
of this more sophisticated definition is that it makes no appeal to a basis, and
so shows that the determinant of an endomorphism is a basis-independent
concept. A byproduct is an easy proof that $\det(AB) = \det(A)\det(B)$, a
result that is not so easy to establish with the elementary definition. We
write

$$\begin{aligned} \det(AB)\,\omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) &= \omega(AB\mathbf{x}_1, AB\mathbf{x}_2, \ldots, AB\mathbf{x}_n) \\ &= \omega(A(B\mathbf{x}_1), A(B\mathbf{x}_2), \ldots, A(B\mathbf{x}_n)) \\ &= \det(A)\,\omega(B\mathbf{x}_1, B\mathbf{x}_2, \ldots, B\mathbf{x}_n) \\ &= \det(A)\det(B)\,\omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n). \end{aligned} \tag{A.69}$$

Cancelling the common factor of $\omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$ completes the proof.

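Exercise A.8: Let $\omega$ be a skew-symmetric n-linear form on an n-dimensional
vector space. Assuming that $\omega$ does not vanish identically, show that a set of
n vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ is linearly independent, and hence forms a basis, if,
and only if, $\omega(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) \ne 0$.

Exercise A.9: Let

$$\mathbf{A} = \begin{pmatrix} \mathbf{a} & \mathbf{b} \\ \mathbf{c} & \mathbf{d} \end{pmatrix}$$

be a partitioned matrix where $\mathbf{a}$ is m-by-m and $\mathbf{d}$ n-by-n. By making a
Gaussian decomposition

$$\mathbf{A} = \begin{pmatrix} \mathbf{I}_m & \mathbf{x} \\ 0 & \mathbf{I}_n \end{pmatrix} \begin{pmatrix} \mathbf{\Lambda}_1 & 0 \\ 0 & \mathbf{\Lambda}_2 \end{pmatrix} \begin{pmatrix} \mathbf{I}_m & 0 \\ \mathbf{y} & \mathbf{I}_n \end{pmatrix},$$

show that, for invertible $\mathbf{d}$, we have Schur's determinant formula (see footnote 6)

$$\det\mathbf{A} = \det(\mathbf{d})\,\det(\mathbf{a} - \mathbf{b}\mathbf{d}^{-1}\mathbf{c}).$$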



6 I. Schur, J. für reine und angewandte Math., 147 (1917) 205-232.

A.6.2 The adjugate matrix

Given a square matrix

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \tag{A.70}$$

and an element $a_{ij}$, we define the corresponding minor $M_{ij}$ to be the
determinant of the (n − 1)-by-(n − 1) matrix constructed by deleting from $\mathbf{A}$ the
row and column containing $a_{ij}$. The number

$$A_{ij} = (-1)^{i+j} M_{ij} \tag{A.71}$$

is then called the co-factor of the element $a_{ij}$. (It is traditional to use
uppercase letters to denote co-factors.) The basic result involving co-factors is
that

$$\sum_j a_{ij} A_{i'j} = \delta_{ii'} \det\mathbf{A}. \tag{A.72}$$

When $i = i'$, this is known as the Laplace development of the determinant
about row i. We get zero when $i \ne i'$ because we are effectively developing
a determinant with two equal rows. We now define the adjugate matrix (see footnote 7),
$\mathrm{Adj}\,\mathbf{A}$, to be the transposed matrix of the co-factors:

$$(\mathrm{Adj}\,\mathbf{A})_{ij} = A_{ji}. \tag{A.73}$$

In terms of this we have

$$\mathbf{A}(\mathrm{Adj}\,\mathbf{A}) = (\det\mathbf{A})\,\mathbf{I}. \tag{A.74}$$

In other words

$$\mathbf{A}^{-1} = \frac{1}{\det\mathbf{A}}\,\mathrm{Adj}\,\mathbf{A}. \tag{A.75}$$

Each entry in the adjugate matrix is a polynomial of degree n − 1 in the
entries of the original matrix. Thus, no division is required to form it, and
the adjugate matrix exists even if the inverse matrix does not.
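
As an illustrative sketch of (A.71) to (A.74), the code below builds the adjugate from co-factors using integer arithmetic only, and checks the product with the original matrix for both an invertible and a singular example. The function names and the sample matrices are assumptions made for the sketch.

```python
def minor_matrix(matrix, i, j):
    """The matrix with row i and column j deleted."""
    return [[matrix[r][c] for c in range(len(matrix)) if c != j]
            for r in range(len(matrix)) if r != i]


def det(matrix):
    """Laplace development about the first row (empty matrix has det 1)."""
    if not matrix:
        return 1
    return sum((-1) ** j * matrix[0][j] * det(minor_matrix(matrix, 0, j))
               for j in range(len(matrix)))


def adjugate(matrix):
    """(Adj A)_{ij} = A_{ji}: the transposed matrix of co-factors."""
    n = len(matrix)
    return [[(-1) ** (i + j) * det(minor_matrix(matrix, j, i))
             for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
S = [[1, 2],
     [2, 4]]        # singular: no inverse, but the adjugate still exists

print(det(A), matmul(A, adjugate(A)))
print(det(S), adjugate(S), matmul(S, adjugate(S)))
```

No division occurs anywhere, so the construction works unchanged for the singular matrix `S`, where the product is the zero matrix.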

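Cayley's theorem

You will know that the possible eigenvalues of the n-by-n matrix $\mathbf{A}$ are given
by the roots of its characteristic equation

$$0 = \det(\mathbf{A} - \lambda\mathbf{I}) = (-1)^n\left(\lambda^n - \mathrm{tr}(\mathbf{A})\lambda^{n-1} + \cdots + (-1)^n\det(\mathbf{A})\right), \tag{A.76}$$

and have probably met with Cayley's theorem that asserts that every matrix
obeys its own characteristic equation:

$$\mathbf{A}^n - \mathrm{tr}(\mathbf{A})\mathbf{A}^{n-1} + \cdots + (-1)^n\det(\mathbf{A})\,\mathbf{I} = 0. \tag{A.77}$$

7 Some authors rather confusingly call the adjugate matrix the adjoint matrix.

The proof of Cayley's theorem involves the adjugate matrix. We write

$$\det(\mathbf{A} - \lambda\mathbf{I}) = (-1)^n\left(\lambda^n + \alpha_1\lambda^{n-1} + \cdots + \alpha_n\right) \tag{A.78}$$

and observe that

$$\det(\mathbf{A} - \lambda\mathbf{I})\,\mathbf{I} = (\mathbf{A} - \lambda\mathbf{I})\,\mathrm{Adj}(\mathbf{A} - \lambda\mathbf{I}). \tag{A.79}$$

Now $\mathrm{Adj}(\mathbf{A} - \lambda\mathbf{I})$ is a matrix-valued polynomial in $\lambda$ of degree n − 1, and it
can be written

$$\mathrm{Adj}(\mathbf{A} - \lambda\mathbf{I}) = \mathbf{C}_0\lambda^{n-1} + \mathbf{C}_1\lambda^{n-2} + \cdots + \mathbf{C}_{n-1}, \tag{A.80}$$

for some matrix coefficients $\mathbf{C}_i$. On multiplying out the equation

$$(-1)^n\left(\lambda^n + \alpha_1\lambda^{n-1} + \cdots + \alpha_n\right)\mathbf{I} = (\mathbf{A} - \lambda\mathbf{I})(\mathbf{C}_0\lambda^{n-1} + \mathbf{C}_1\lambda^{n-2} + \cdots + \mathbf{C}_{n-1}) \tag{A.81}$$

and comparing like powers of $\lambda$, we find the relations

$$\begin{aligned} (-1)^n\,\mathbf{I} &= -\mathbf{C}_0, \\ (-1)^n\alpha_1\mathbf{I} &= -\mathbf{C}_1 + \mathbf{A}\mathbf{C}_0, \\ (-1)^n\alpha_2\mathbf{I} &= -\mathbf{C}_2 + \mathbf{A}\mathbf{C}_1, \\ &\ \,\vdots \\ (-1)^n\alpha_{n-1}\mathbf{I} &= -\mathbf{C}_{n-1} + \mathbf{A}\mathbf{C}_{n-2}, \\ (-1)^n\alpha_n\mathbf{I} &= \mathbf{A}\mathbf{C}_{n-1}. \end{aligned}$$

Multiply the first equation on the left by $\mathbf{A}^n$, the second by $\mathbf{A}^{n-1}$, and so
on down to the last equation, which we multiply by $\mathbf{A}^0 \equiv \mathbf{I}$. Now add. We find
that the sum telescopes to give Cayley's theorem,

$$\mathbf{A}^n + \alpha_1\mathbf{A}^{n-1} + \cdots + \alpha_n\mathbf{I} = 0,$$

as advertised.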
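
Cayley's theorem is easy to test numerically. The sketch below obtains the coefficients $\alpha_k$ by the Faddeev-LeVerrier recursion, a different technique from the adjugate argument above that is swapped in here only because it is short to code, and then evaluates the characteristic polynomial at the matrix itself. The function names and the example matrix are illustrative assumptions.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def characteristic_coefficients(a):
    """[1, alpha_1, ..., alpha_n] with
    lambda^n + alpha_1 lambda^(n-1) + ... + alpha_n = det(lambda I - A),
    computed by the Faddeev-LeVerrier recursion."""
    n = len(a)
    m = identity(n)
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        alpha = -sum(am[i][i] for i in range(n)) / k
        coeffs.append(alpha)
        m = [[am[i][j] + (alpha if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


def polynomial_at_matrix(coeffs, a):
    """Horner evaluation of A^n + alpha_1 A^(n-1) + ... + alpha_n I."""
    n = len(a)
    unit = identity(n)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, a)
        result = [[result[i][j] + c * unit[i][j] for j in range(n)]
                  for i in range(n)]
    return result


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]

alphas = characteristic_coefficients(A)
residual = polynomial_at_matrix(alphas, A)
print(alphas)
print(all(entry == 0 for row in residual for entry in row))
```

For this matrix $\alpha_1 = -\mathrm{tr}\,\mathbf{A}$ and $\alpha_3 = -\det\mathbf{A}$, in agreement with (A.76), and the residual matrix vanishes exactly.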

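A.6.3 Differentiating determinants

Suppose that the elements of $\mathbf{A}$ depend on some parameter x. From the
elementary definition

$$\det\mathbf{A} = \epsilon_{i_1 i_2 \ldots i_n} a_{1 i_1} a_{2 i_2} \cdots a_{n i_n},$$

we find

$$\frac{d}{dx}\det\mathbf{A} = \epsilon_{i_1 i_2 \ldots i_n}\left(a'_{1 i_1} a_{2 i_2} \cdots a_{n i_n} + a_{1 i_1} a'_{2 i_2} \cdots a_{n i_n} + \cdots + a_{1 i_1} a_{2 i_2} \cdots a'_{n i_n}\right). \tag{A.82}$$

In other words,

$$\frac{d}{dx}\det\mathbf{A} = \begin{vmatrix} a'_{11} & a'_{12} & \cdots & a'_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a'_{21} & a'_{22} & \cdots & a'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} + \cdots + \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a'_{n1} & a'_{n2} & \cdots & a'_{nn} \end{vmatrix}.$$

The same result can also be written more compactly as

$$\frac{d}{dx}\det\mathbf{A} = \sum_{ij} \frac{da_{ij}}{dx} A_{ij}, \tag{A.83}$$

where $A_{ij}$ is the cofactor of $a_{ij}$. Using the connection between the adjugate
matrix and the inverse, this is equivalent to

$$\frac{1}{\det\mathbf{A}}\frac{d}{dx}\det\mathbf{A} = \mathrm{tr}\left\{\frac{d\mathbf{A}}{dx}\mathbf{A}^{-1}\right\}, \tag{A.84}$$

or

$$\frac{d}{dx}\ln(\det\mathbf{A}) = \mathrm{tr}\left\{\frac{d\mathbf{A}}{dx}\mathbf{A}^{-1}\right\}. \tag{A.85}$$

A special case of this formula is the result

$$\frac{\partial}{\partial a_{ij}}\ln(\det\mathbf{A}) = \left(\mathbf{A}^{-1}\right)_{ji}. \tag{A.86}$$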
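
A quick finite-difference check of (A.85) is sketched below. The parameter-dependent 2-by-2 matrix and the step size are arbitrary illustrative choices, and agreement is only expected to within the accuracy of the central difference.

```python
import math


def a_of(x):
    """A hypothetical parameter-dependent matrix A(x)."""
    return [[1.0 + x, x ** 2],
            [x, 2.0]]


def da_dx(x):
    """Entry-by-entry derivative of a_of."""
    return [[1.0, 2.0 * x],
            [1.0, 0.0]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """tr(a b) = sum_{i,k} a_{ik} b_{ki}."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


x, h = 0.5, 1e-6
numeric = (math.log(det2(a_of(x + h))) - math.log(det2(a_of(x - h)))) / (2 * h)
analytic = trace_of_product(da_dx(x), inverse2(a_of(x)))
print(numeric, analytic)
```

Here det A(x) = 2(1 + x) − x³, so both numbers should be close to (2 − 3x²)/det A(x) at the chosen x.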



A.7 Diagonalization and Canonical Forms

An essential part of the linear algebra tool-kit is the set of techniques for the 
reduction of a matrix to its simplest, canonical form. This is often a diagonal 
matrix. 



A.7.1 Diagonalizing linear maps

A common task is the diagonalization of a matrix A representing a linear 
map A. Let us recall some standard material relating to this: 
i) If $A\mathbf{x} = \lambda\mathbf{x}$ for a non-zero vector $\mathbf{x}$, then $\mathbf{x}$ is said to be an eigenvector
of $A$ with eigenvalue $\lambda$.

ii) A linear operator $A$ on a finite-dimensional vector space is said to be
self-adjoint, or Hermitian, with respect to the inner product $(\ ,\ )$ if
$A = A^\dagger$, or equivalently if $(\mathbf{x}, A\mathbf{y}) = (A\mathbf{x}, \mathbf{y})$ for all $\mathbf{x}$ and $\mathbf{y}$.

iii) If $A$ is Hermitian with respect to a positive definite inner product $(\ ,\ )$
then all the eigenvalues $\lambda$ are real. To see that this is so, we write

$$\lambda(\mathbf{x}, \mathbf{x}) = (\mathbf{x}, \lambda\mathbf{x}) = (\mathbf{x}, A\mathbf{x}) = (A\mathbf{x}, \mathbf{x}) = (\lambda\mathbf{x}, \mathbf{x}) = \lambda^*(\mathbf{x}, \mathbf{x}). \tag{A.87}$$

Because the inner product is positive definite and $\mathbf{x}$ is not zero, the
factor $(\mathbf{x}, \mathbf{x})$ cannot be zero. We conclude that $\lambda = \lambda^*$.

iv) If $A$ is Hermitian and $\lambda_i$ and $\lambda_j$ are two distinct eigenvalues with
eigenvectors $\mathbf{x}_i$ and $\mathbf{x}_j$, respectively, then $(\mathbf{x}_i, \mathbf{x}_j) = 0$. To prove this, we
write

$$\lambda_j(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i, A\mathbf{x}_j) = (A\mathbf{x}_i, \mathbf{x}_j) = (\lambda_i\mathbf{x}_i, \mathbf{x}_j) = \lambda_i^*(\mathbf{x}_i, \mathbf{x}_j). \tag{A.88}$$

But $\lambda_i^* = \lambda_i$, and so $(\lambda_i - \lambda_j)(\mathbf{x}_i, \mathbf{x}_j) = 0$. Since, by assumption,
$(\lambda_i - \lambda_j) \ne 0$ we must have $(\mathbf{x}_i, \mathbf{x}_j) = 0$.

v) An operator $A$ is said to be diagonalizable if we can find a basis for V
that consists of eigenvectors of $A$. In this basis, $A$ is represented by the
matrix $\mathbf{A} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, where the $\lambda_i$ are the eigenvalues.
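
Items iii) and iv) can be seen at work in a small example. The sketch assumes the standard inner product on $\mathbb{C}^2$ (anti-linear in the first slot) and a hypothetical 2-by-2 Hermitian matrix; the eigenvalues come from the quadratic characteristic equation, and for a matrix with rows (a, b), (c, d) and b non-zero an eigenvector for eigenvalue λ is (b, λ − a).

```python
import cmath

A = [[2, 1 - 1j],
     [1 + 1j, 3]]       # equal to its own conjugate transpose

trace = A[0][0] + A[1][1]
determinant = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Roots of lambda^2 - tr(A) lambda + det(A) = 0.
root = cmath.sqrt(trace * trace - 4 * determinant)
eigenvalues = [(trace - root) / 2, (trace + root) / 2]


def eigenvector(lam):
    """Solves the first row of (A - lam I) v = 0; valid because A[0][1] != 0."""
    return [A[0][1], lam - A[0][0]]


def inner(x, y):
    """Standard inner product, conjugating the first argument."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def apply(matrix, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in matrix]


v1, v2 = (eigenvector(lam) for lam in eigenvalues)
print(eigenvalues)          # imaginary parts vanish
print(inner(v1, v2))        # distinct eigenvalues: orthogonal eigenvectors
```

Replacing `A` by a non-Hermitian matrix generally spoils both properties.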

Not all linear operators can be diagonalized. The key element determining
the diagonalizability of a matrix is the minimal polynomial equation obeyed
by the matrix representing the operator. As mentioned in the previous
section, the possible eigenvalues of an n-by-n matrix $\mathbf{A}$ are given by the roots of
the characteristic equation

$$0 = \det(\mathbf{A} - \lambda\mathbf{I}) = (-1)^n\left(\lambda^n - \mathrm{tr}(\mathbf{A})\lambda^{n-1} + \cdots + (-1)^n\det(\mathbf{A})\right).$$

This is because a non-trivial solution to the equation

$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x} \tag{A.89}$$

requires the matrix $\mathbf{A} - \lambda\mathbf{I}$ to have a non-trivial nullspace, and so $\det(\mathbf{A} - \lambda\mathbf{I})$
must vanish. Cayley's Theorem, which we proved in the previous section,
asserts that every matrix obeys its own characteristic equation:

$$\mathbf{A}^n - \mathrm{tr}(\mathbf{A})\mathbf{A}^{n-1} + \cdots + (-1)^n\det(\mathbf{A})\,\mathbf{I} = 0.$$
The matrix $\mathbf{A}$ may, however, satisfy an equation of lower degree. For
example, the characteristic equation of the matrix

$$\mathbf{A} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_1 \end{pmatrix} \tag{A.90}$$

is $(\lambda - \lambda_1)^2 = 0$. Cayley therefore asserts that $(\mathbf{A} - \lambda_1\mathbf{I})^2 = 0$. This is clearly
true, but $\mathbf{A}$ also satisfies the equation of first degree $(\mathbf{A} - \lambda_1\mathbf{I}) = 0$.

The equation of lowest degree satisfied by $\mathbf{A}$ is said to be the minimal
polynomial equation. It is unique up to an overall numerical factor: if two
distinct minimal equations of degree $n$ were to exist, and if we normalize
them so that the coefficients of $\mathbf{A}^n$ coincide, then their difference, if
non-zero, would be an equation of degree $\le (n-1)$ obeyed by $\mathbf{A}$, in
contradiction to the minimal equation having degree $n$.

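If

$$P(\mathbf{A}) = (\mathbf{A} - \lambda_1\mathbf{I})^{\alpha_1}(\mathbf{A} - \lambda_2\mathbf{I})^{\alpha_2}\cdots(\mathbf{A} - \lambda_n\mathbf{I})^{\alpha_n} = 0 \tag{A.91}$$

is the minimal equation then each root $\lambda_i$ is an eigenvalue of $\mathbf{A}$. To prove
this, we select one factor of $(\mathbf{A} - \lambda_i\mathbf{I})$ and write

$$P(\mathbf{A}) = (\mathbf{A} - \lambda_i\mathbf{I})\,Q(\mathbf{A}), \tag{A.92}$$

where $Q(\mathbf{A})$ contains all the remaining factors in $P(\mathbf{A})$. We now observe
that there must be some vector $\mathbf{y}$ such that $\mathbf{x} = Q(\mathbf{A})\mathbf{y}$ is not zero. If there
were no such $\mathbf{y}$ then $Q(\mathbf{A}) = 0$ would be an equation of lower degree obeyed
by $\mathbf{A}$, in contradiction to the assumed minimality of $P(\mathbf{A})$. Since

$$0 = P(\mathbf{A})\mathbf{y} = (\mathbf{A} - \lambda_i\mathbf{I})\mathbf{x} \tag{A.93}$$

we see that $\mathbf{x}$ is an eigenvector of $\mathbf{A}$ with eigenvalue $\lambda_i$.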

Because all possible eigenvalues appear as roots of the characteristic
equation, the minimal equation must have the same roots as the characteristic
equation, but with equal or lower multiplicities $\alpha_i$.

In the special case that $\mathbf{A}$ is self-adjoint, or Hermitian, with respect to a
positive definite inner product $\langle\ ,\ \rangle$ the minimal equation has no repeated
roots. Suppose that this were not so, and that $\mathbf{A}$ has minimal equation
$(\mathbf{A} - \lambda\mathbf{I})^2 R(\mathbf{A}) = 0$, where $R(\mathbf{A})$ is a polynomial in $\mathbf{A}$. Then, for all vectors
$\mathbf{x}$ we have

$$0 = \langle R\mathbf{x}, (\mathbf{A} - \lambda\mathbf{I})^2 R\mathbf{x}\rangle = \langle(\mathbf{A} - \lambda\mathbf{I})R\mathbf{x}, (\mathbf{A} - \lambda\mathbf{I})R\mathbf{x}\rangle. \tag{A.94}$$

Now the vanishing of the rightmost expression shows that $(\mathbf{A} - \lambda\mathbf{I})R(\mathbf{A})\mathbf{x} = 0$
for all $\mathbf{x}$. In other words

$$(\mathbf{A} - \lambda\mathbf{I})R(\mathbf{A}) = 0. \tag{A.95}$$

The equation with the repeated factor was therefore not minimal, and we
have a contradiction.

If the equation of lowest degree satisfied by the matrix has no repeated
roots, the matrix is diagonalizable; if there are repeated roots, it is not. The
last statement should be obvious, because a diagonalized matrix satisfies an
equation with no repeated roots, and this equation will hold in all bases,
including the original one. The first statement, in combination with
the observation that the minimal equation for a Hermitian matrix has no
repeated roots, shows that a matrix that is Hermitian with respect to a
positive definite inner product can be diagonalized.

To establish the first statement, suppose that $\mathbf{A}$ obeys the equation

$$0 = P(\mathbf{A}) = (\mathbf{A} - \lambda_1\mathbf{I})(\mathbf{A} - \lambda_2\mathbf{I})\cdots(\mathbf{A} - \lambda_n\mathbf{I}), \tag{A.96}$$

where the $\lambda_i$ are all distinct. Then, setting $x \to \mathbf{A}$ in the identity⁸

$$1 = \frac{(x-\lambda_2)(x-\lambda_3)\cdots(x-\lambda_n)}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_n)} + \frac{(x-\lambda_1)(x-\lambda_3)\cdots(x-\lambda_n)}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)\cdots(\lambda_2-\lambda_n)} + \cdots + \frac{(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_{n-1})}{(\lambda_n-\lambda_1)(\lambda_n-\lambda_2)\cdots(\lambda_n-\lambda_{n-1})}, \tag{A.97}$$

where in each term one of the factors of the polynomial is omitted in both
numerator and denominator, we may write

$$\mathbf{I} = \mathbf{P}_1 + \mathbf{P}_2 + \cdots + \mathbf{P}_n, \tag{A.98}$$

where

$$\mathbf{P}_1 = \frac{(\mathbf{A}-\lambda_2\mathbf{I})(\mathbf{A}-\lambda_3\mathbf{I})\cdots(\mathbf{A}-\lambda_n\mathbf{I})}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)\cdots(\lambda_1-\lambda_n)}, \tag{A.99}$$

etc. Clearly $\mathbf{P}_i\mathbf{P}_j = 0$ if $i \neq j$, because the product contains the minimal
equation as a factor. Multiplying (A.98) by $\mathbf{P}_i$ therefore gives $\mathbf{P}_i^2 = \mathbf{P}_i$,
showing that the $\mathbf{P}_i$ are projection operators. Further $(\mathbf{A} - \lambda_i\mathbf{I})\mathbf{P}_i = 0$, so

$$(\mathbf{A} - \lambda_i\mathbf{I})(\mathbf{P}_i\mathbf{x}) = 0 \tag{A.100}$$

for any vector $\mathbf{x}$, and we see that $\mathbf{P}_i\mathbf{x}$, if not zero, is an eigenvector with
eigenvalue $\lambda_i$. Thus $\mathbf{P}_i$ projects onto the $i$-th eigenspace. Any vector can
therefore be decomposed

$$\mathbf{x} = \mathbf{P}_1\mathbf{x} + \mathbf{P}_2\mathbf{x} + \cdots + \mathbf{P}_n\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2 + \cdots + \mathbf{x}_n, \tag{A.101}$$

where $\mathbf{x}_i$, if not zero, is an eigenvector with eigenvalue $\lambda_i$. Since any $\mathbf{x}$ can
be written as a sum of eigenvectors, the eigenvectors span the space.

⁸ The identity may be verified by observing that the difference of the left and right hand
sides is a polynomial of degree $n-1$, which, by inspection, vanishes at the $n$ points $x = \lambda_i$.
But a polynomial which has more zeros than its degree must be identically zero.
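The projector construction can be tried on a small matrix with distinct eigenvalues. The sketch below is illustrative only: the helper names (`matmul`, `shifted`, `projectors`) are made up for this example, exact rational arithmetic is used to avoid rounding, and the eigenvalues are assumed to be supplied by the caller rather than computed.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def shifted(A, lam):
    """Return A - lam * I."""
    n = len(A)
    return [[Fraction(A[i][j]) - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


def projectors(A, eigenvalues):
    """P_i = prod_{j != i} (A - lam_j I) / (lam_i - lam_j).

    The eigenvalues are assumed to be distinct and to be the roots of the
    minimal equation of A.
    """
    n = len(A)
    result = []
    for i, lam_i in enumerate(eigenvalues):
        P = identity(n)
        for j, lam_j in enumerate(eigenvalues):
            if j != i:
                scale = Fraction(1) / (lam_i - lam_j)
                P = [[scale * entry for entry in row]
                     for row in matmul(P, shifted(A, lam_j))]
        result.append(P)
    return result


if __name__ == "__main__":
    A = [[2, 1], [1, 2]]          # eigenvalues 1 and 3
    for P in projectors(A, [1, 3]):
        print(P)
```

For this matrix the two projectors add up to the identity, square to themselves, and annihilate each other, as the argument above requires.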



Jordan decomposition 

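If the minimal polynomial has repeated roots, the matrix can still be
reduced to the Jordan canonical form, which is diagonal except for some 1's
immediately above the diagonal.

For example, suppose the characteristic equation for a 6-by-6 matrix $\mathbf{A}$ is

$$0 = \det(\mathbf{A} - \lambda\mathbf{I}) = (\lambda_1 - \lambda)^3(\lambda_2 - \lambda)^3, \tag{A.102}$$

but the minimal equation is

$$0 = (\lambda_1 - \lambda)^3(\lambda_2 - \lambda)^2. \tag{A.103}$$

Then the Jordan form of $\mathbf{A}$ might be

$$\mathbf{T}^{-1}\mathbf{A}\mathbf{T} = \begin{pmatrix}
\lambda_1 & 1 & 0 & 0 & 0 & 0 \\
0 & \lambda_1 & 1 & 0 & 0 & 0 \\
0 & 0 & \lambda_1 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_2 & 1 & 0 \\
0 & 0 & 0 & 0 & \lambda_2 & 0 \\
0 & 0 & 0 & 0 & 0 & \lambda_2
\end{pmatrix}. \tag{A.104}$$

One may easily see that (A.103) is the minimal equation for this matrix. The
minimal equation alone does not uniquely specify the pattern of $\lambda_i$'s and 1's
in the Jordan form, though.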
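The claim that (A.103) is the minimal equation of the matrix in (A.104) can be checked by direct multiplication. The following sketch uses arbitrary integer values for the two eigenvalues; the function names are illustrative and not part of the original text.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_example(lam1, lam2):
    """The 6-by-6 Jordan matrix of (A.104): blocks of size 3, 2 and 1."""
    J = [[0] * 6 for _ in range(6)]
    for i in range(6):
        J[i][i] = lam1 if i < 3 else lam2
    J[0][1] = J[1][2] = 1   # 3-by-3 block for lam1
    J[3][4] = 1             # 2-by-2 block for lam2; the last lam2 stands alone
    return J


def shifted_power_product(J, lam1, p, lam2, q):
    """Return (J - lam1 I)^p (J - lam2 I)^q."""
    n = len(J)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for lam, power in ((lam1, p), (lam2, q)):
        shifted = [[J[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        for _ in range(power):
            result = matmul(result, shifted)
    return result


def is_zero(M):
    return all(entry == 0 for row in M for entry in row)


if __name__ == "__main__":
    J = jordan_example(2, 5)
    print(is_zero(shifted_power_product(J, 2, 3, 5, 2)))
    print(is_zero(shifted_power_product(J, 2, 2, 5, 2)))
    print(is_zero(shifted_power_product(J, 2, 3, 5, 1)))
```

Lowering either exponent leaves a non-zero matrix, which is what minimality means here.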

It is rather tedious, but quite straightforward, to show that any linear 
map can be reduced to a Jordan form. The proof is sketched in the following 
exercises: 
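Exercise A.10: Suppose that the linear operator $T$ is represented by an $N \times N$
matrix, where $N > 1$. $T$ obeys the equation

$$(T - \lambda I)^p = 0,$$

with $p = N$, but does not obey this equation for any $p < N$. Here $\lambda$ is a
number and $I$ is the identity operator.

i) Show that if $T$ has an eigenvector, the corresponding eigenvalue must be
$\lambda$. Deduce that $T$ cannot be diagonalized.

ii) Show that there exists a vector $\mathbf{e}_1$ such that $(T - \lambda I)^N\mathbf{e}_1 = 0$, but no
lesser power of $(T - \lambda I)$ kills $\mathbf{e}_1$.

iii) Define $\mathbf{e}_2 = (T - \lambda I)\mathbf{e}_1$, $\mathbf{e}_3 = (T - \lambda I)^2\mathbf{e}_1$, etc. up to $\mathbf{e}_N$. Show that the
vectors $\mathbf{e}_1, \ldots, \mathbf{e}_N$ are linearly independent.

iv) Use $\mathbf{e}_1, \ldots, \mathbf{e}_N$ as a basis for your vector space. Taking

$$\mathbf{e}_1 = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{e}_2 = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ 0 \end{pmatrix}, \quad \ldots, \quad \mathbf{e}_N = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},$$

write out the matrix representing $T$ in the $\mathbf{e}_i$ basis.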
Exercise A.11: Let $T : V \to V$ be a linear map and suppose that the minimal polynomial
equation satisfied by $T$ is

$$Q(T) = (T - \lambda_1 I)^{r_1}(T - \lambda_2 I)^{r_2}\cdots(T - \lambda_n I)^{r_n} = 0.$$

Let $V_{\lambda_i}$ denote the space of generalized eigenvectors for the eigenvalue $\lambda_i$. This
is the set of $\mathbf{x}$ such that $(T - \lambda_i I)^{r_i}\mathbf{x} = 0$. You will show that

$$V = \bigoplus_i V_{\lambda_i}.$$

i) Consider the set of polynomials $Q_{\lambda_i,j}(t) = (t - \lambda_i)^{-(r_i - j + 1)}Q(t)$ where
$j = 1, \ldots, r_i$. Show that this set of $N = \sum_i r_i$ polynomials forms a
basis for the vector space $\mathcal{F}_{N-1}(t)$ of polynomials in $t$ of degree no more
than $N - 1$. (Since the number of $Q_{\lambda_i,j}$ is $N$, and this is equal to the
dimension of $\mathcal{F}_{N-1}(t)$, the claim will be established if you can show that
the polynomials are linearly independent. This is easy to do: suppose
that

$$\sum_{\lambda_i,j}\alpha_{\lambda_i,j}Q_{\lambda_i,j}(t) = 0.$$

Set $t = \lambda_i$ and deduce that $\alpha_{\lambda_i,1} = 0$. Knowing this, differentiate with
respect to $t$ and again set $t = \lambda_i$ and deduce that $\alpha_{\lambda_i,2} = 0$, and so on.)

ii) Since the $Q_{\lambda_i,j}$ form a basis, and since $1 \in \mathcal{F}_{N-1}$, argue that we can find
$\beta_{\lambda_i,j}$ such that

$$1 = \sum_{\lambda_i,j}\beta_{\lambda_i,j}Q_{\lambda_i,j}(t).$$

Now define

$$P_i = \sum_{j=1}^{r_i}\beta_{\lambda_i,j}Q_{\lambda_i,j}(T),$$

and so

$$I = \sum_{\lambda_i}P_i. \qquad (\ast)$$

Use the minimal polynomial equation to deduce that $P_iP_j = 0$ if $i \neq j$.
Multiplication of $(\ast)$ by $P_i$ then shows that $P_iP_j = \delta_{ij}P_j$. Deduce from this
that $(\ast)$ is a decomposition of the identity into a sum of mutually
orthogonal projection operators $P_i$ that project onto the spaces $V_{\lambda_i}$. Conclude
that any $\mathbf{x}$ can be expanded as $\mathbf{x} = \sum_i\mathbf{x}_i$ with $\mathbf{x}_i = P_i\mathbf{x} \in V_{\lambda_i}$.

iii) Show that the decomposition also implies that $V_{\lambda_i} \cap V_{\lambda_j} = \{0\}$ if $i \neq j$.
(Hint: a vector in $V_{\lambda_i}$ is killed by all projectors with the possible
exception of $P_i$, and a vector in $V_{\lambda_j}$ will be killed by all the projectors
with the possible exception of $P_j$.)

iv) Put these results together to deduce that $V$ is a direct sum of the $V_{\lambda_i}$.

v) Combine the result of part iv) with the ideas behind exercise A.10 to
complete the proof of the Jordan decomposition theorem.

A.7.2 Diagonalizing quadratic forms

Do not confuse the notion of diagonalizing the matrix representing a linear
map $A : V \to V$ with that of diagonalizing the matrix representing a
quadratic form. A (real) quadratic form is a map $Q : V \to \mathbb{R}$, which is
obtained from a symmetric bilinear form $B : V \times V \to \mathbb{R}$ by setting the two
arguments, $\mathbf{x}$ and $\mathbf{y}$, in $B(\mathbf{x}, \mathbf{y})$ equal:

$$Q(\mathbf{x}) = B(\mathbf{x}, \mathbf{x}). \tag{A.105}$$

No information is lost by this specialization. We can recover the non-diagonal
($\mathbf{x} \neq \mathbf{y}$) values of $B$ from the diagonal values, $Q(\mathbf{x})$, by using the polarization
trick

$$B(\mathbf{x}, \mathbf{y}) = \tfrac{1}{2}\left[Q(\mathbf{x} + \mathbf{y}) - Q(\mathbf{x}) - Q(\mathbf{y})\right]. \tag{A.106}$$
An example of a real quadratic form is the kinetic energy term

$$T(\dot{\mathbf{x}}) = \tfrac{1}{2}m_{ij}\dot{x}^i\dot{x}^j = \tfrac{1}{2}\dot{\mathbf{x}}^T\mathbf{M}\dot{\mathbf{x}} \tag{A.107}$$

in a "small vibrations" Lagrangian. Here, $\mathbf{M}$, with entries $m_{ij}$, is the mass
matrix.

Whilst one can diagonalize such forms by the tedious procedure of finding 
the eigenvalues and eigenvectors of the associated matrix, it is simpler to use 
Lagrange's method, which is based on repeatedly completing squares. 

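Consider, for example, the quadratic form

$$Q = x^2 - y^2 - z^2 + 2xy - 4xz + 6yz = (x, y, z)\begin{pmatrix} 1 & 1 & -2 \\ 1 & -1 & 3 \\ -2 & 3 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}. \tag{A.108}$$

We complete the square involving $x$:

$$Q = (x + y - 2z)^2 - 2y^2 + 10yz - 5z^2, \tag{A.109}$$

where the terms outside the squared group no longer involve $x$. We now
complete the square in $y$:

$$Q = (x + y - 2z)^2 - \left(\sqrt{2}\,y - \tfrac{5}{\sqrt{2}}z\right)^2 + \tfrac{15}{2}z^2, \tag{A.110}$$

so that the remaining term no longer contains $y$. Thus, on setting

$$\xi = x + y - 2z, \qquad \eta = \sqrt{2}\,y - \tfrac{5}{\sqrt{2}}z, \qquad \zeta = \sqrt{\tfrac{15}{2}}\,z,$$

we have

$$Q = \xi^2 - \eta^2 + \zeta^2 = (\xi, \eta, \zeta)\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} \xi \\ \eta \\ \zeta \end{pmatrix}. \tag{A.111}$$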
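A quick numerical check guards against slips when completing squares by hand. The sketch below simply evaluates the original form and the diagonalized form at the same point; the function names `q_original` and `q_diagonal` are illustrative.

```python
import math


def q_original(x, y, z):
    """Q = x^2 - y^2 - z^2 + 2xy - 4xz + 6yz."""
    return x * x - y * y - z * z + 2 * x * y - 4 * x * z + 6 * y * z


def q_diagonal(x, y, z):
    """The same form written as xi^2 - eta^2 + zeta^2."""
    xi = x + y - 2 * z
    eta = math.sqrt(2) * y - 5 / math.sqrt(2) * z
    zeta = math.sqrt(15 / 2) * z
    return xi ** 2 - eta ** 2 + zeta ** 2


if __name__ == "__main__":
    for point in [(1, 2, 3), (-0.5, 0.25, 4.0)]:
        print(point, q_original(*point), q_diagonal(*point))
```

The two functions agree to rounding error at any test point, which confirms that the change of variables reproduces the original form.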
If there are no $x^2$, $y^2$, or $z^2$ terms to get us started, then we can proceed by
using $(x + y)^2$ and $(x - y)^2$. For example, consider

$$\begin{aligned}
Q &= 2xy + 2yz + 2zx \\
  &= \tfrac{1}{2}(x + y)^2 - \tfrac{1}{2}(x - y)^2 + 2xz + 2yz \\
  &= \tfrac{1}{2}(x + y)^2 + 2(x + y)z - \tfrac{1}{2}(x - y)^2 \\
  &= \tfrac{1}{2}(x + y + 2z)^2 - \tfrac{1}{2}(x - y)^2 - 2z^2 \\
  &= \xi^2 - \eta^2 - \zeta^2,
\end{aligned}$$

where

$$\xi = \tfrac{1}{\sqrt{2}}(x + y + 2z), \qquad \eta = \tfrac{1}{\sqrt{2}}(x - y), \qquad \zeta = \sqrt{2}\,z.$$

A judicious combination of these two tactics will reduce the matrix
representing any real quadratic form to a matrix with ±1's and 0's on the diagonal,
and zeros elsewhere. As the egregiously asymmetric treatment of $x$, $y$, $z$
in the last example indicates, this can be done in many ways, but Sylvester's
Law of Inertia asserts that the number of +1's, −1's and 0's will always be
the same. Naturally, if we allow complex numbers in the redefinitions of the
variables, we can always reduce the form to one with only +1's and 0's.

The essential difference between diagonalizing linear maps and
diagonalizing quadratic forms is that in the former case we seek matrices $\mathbf{A}$ such that
$\mathbf{A}^{-1}\mathbf{M}\mathbf{A}$ is diagonal, whereas in the latter case we seek matrices $\mathbf{A}$ such that
$\mathbf{A}^T\mathbf{M}\mathbf{A}$ is diagonal. Here, the superscript $T$ denotes transposition.



Exercise A.12: Show that the matrix

$$\mathbf{Q} = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$$

representing the quadratic form

$$Q(x, y) = ax^2 + 2bxy + cy^2$$

may be reduced to

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \text{or} \quad \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},$$

depending on whether the discriminant, $ac - b^2$, is respectively greater than
zero, less than zero, or equal to zero.

Warning: You might be tempted to refer to the discriminant $ac - b^2$ as being
the determinant of $Q$. It is indeed the determinant of the matrix $\mathbf{Q}$, but there
is no such thing as the "determinant" of the quadratic form itself. You may
compute the determinant of the matrix representing $Q$ in some basis, but if
you change basis and repeat the calculation you will get a different answer.
For real quadratic forms, however, the sign of the determinant stays the same,
and this is all that the discriminant cares about.

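A.7.3 Block-diagonalizing symplectic forms

A skew-symmetric bilinear form $\omega : V \times V \to \mathbb{R}$ is often called a symplectic
form. Such forms play an important role in Hamiltonian dynamics and in
optics. Let $\{\mathbf{e}_i\}$ be a basis for $V$, and set

$$\omega(\mathbf{e}_i, \mathbf{e}_j) = \omega_{ij}. \tag{A.112}$$

If $\mathbf{x} = x^i\mathbf{e}_i$ and $\mathbf{y} = y^i\mathbf{e}_i$, we therefore have

$$\omega(\mathbf{x}, \mathbf{y}) = \omega(\mathbf{e}_i, \mathbf{e}_j)x^iy^j = \omega_{ij}x^iy^j. \tag{A.113}$$

The numbers $\omega_{ij}$ can be thought of as the entries in a real skew-symmetric
matrix $\Omega$, in terms of which $\omega(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T\Omega\mathbf{y}$. We cannot exactly "diagonalize"
such a skew-symmetric matrix because a matrix with non-zero entries
only on its principal diagonal is necessarily symmetric. We can do the next
best thing, however, and reduce $\Omega$ to block diagonal form with simple 2-by-2
skew matrices along the diagonal.

We begin by expanding $\omega$ as

$$\omega = \tfrac{1}{2}\omega_{ij}\,\mathbf{e}^{*i} \wedge \mathbf{e}^{*j}, \tag{A.114}$$

where the wedge (or exterior) product $\mathbf{e}^{*i} \wedge \mathbf{e}^{*j}$ of a pair of basis vectors in
$V^*$ denotes the particular skew-symmetric bilinear form

$$\mathbf{e}^{*i} \wedge \mathbf{e}^{*j}(\mathbf{e}_\alpha, \mathbf{e}_\beta) = \delta^i_\alpha\delta^j_\beta - \delta^i_\beta\delta^j_\alpha. \tag{A.115}$$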
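Again, if $\mathbf{x} = x^i\mathbf{e}_i$ and $\mathbf{y} = y^i\mathbf{e}_i$, we have

$$\mathbf{e}^{*i} \wedge \mathbf{e}^{*j}(\mathbf{x}, \mathbf{y}) = \mathbf{e}^{*i} \wedge \mathbf{e}^{*j}(x^\alpha\mathbf{e}_\alpha, y^\beta\mathbf{e}_\beta) = x^iy^j - y^ix^j. \tag{A.116}$$

Consequently

$$\omega(\mathbf{x}, \mathbf{y}) = \tfrac{1}{2}\omega_{ij}(x^iy^j - y^ix^j) = \omega_{ij}x^iy^j, \tag{A.117}$$

as before. We extend the definition of the wedge product to other elements
of $V^*$ by requiring "$\wedge$" to be associative and distributive, taking note that

$$\mathbf{e}^{*i} \wedge \mathbf{e}^{*j} = -\mathbf{e}^{*j} \wedge \mathbf{e}^{*i}, \tag{A.118}$$

and so $0 = \mathbf{e}^{*1} \wedge \mathbf{e}^{*1} = \mathbf{e}^{*2} \wedge \mathbf{e}^{*2}$, etc.

We next show that there exists a basis $\{\mathbf{f}^{*i}\}$ of $V^*$ such that

$$\omega = \mathbf{f}^{*1} \wedge \mathbf{f}^{*2} + \mathbf{f}^{*3} \wedge \mathbf{f}^{*4} + \cdots + \mathbf{f}^{*(p-1)} \wedge \mathbf{f}^{*p}. \tag{A.119}$$

Here, the integer $p \le n$ is the rank of $\omega$. It is necessarily an even number.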

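The new basis is constructed by a skew-analogue of Lagrange's method
of completing the square. If

$$\omega = \tfrac{1}{2}\omega_{ij}\,\mathbf{e}^{*i} \wedge \mathbf{e}^{*j} \tag{A.120}$$

is not identically zero, we can, after re-ordering the basis if necessary, assume
that $\omega_{12} \neq 0$. Then

$$\omega = \left(\mathbf{e}^{*1} - \frac{1}{\omega_{12}}(\omega_{23}\mathbf{e}^{*3} + \cdots + \omega_{2n}\mathbf{e}^{*n})\right) \wedge \left(\omega_{12}\mathbf{e}^{*2} + \omega_{13}\mathbf{e}^{*3} + \cdots + \omega_{1n}\mathbf{e}^{*n}\right) + \omega^{\{3\}}, \tag{A.121}$$

where $\omega^{\{3\}} \in \Lambda^2(V^*)$ does not contain $\mathbf{e}^{*1}$ or $\mathbf{e}^{*2}$. We set

$$\mathbf{f}^{*1} = \mathbf{e}^{*1} - \frac{1}{\omega_{12}}(\omega_{23}\mathbf{e}^{*3} + \cdots + \omega_{2n}\mathbf{e}^{*n}) \tag{A.122}$$

and

$$\mathbf{f}^{*2} = \omega_{12}\mathbf{e}^{*2} + \omega_{13}\mathbf{e}^{*3} + \cdots + \omega_{1n}\mathbf{e}^{*n}. \tag{A.123}$$

Thus,

$$\omega = \mathbf{f}^{*1} \wedge \mathbf{f}^{*2} + \omega^{\{3\}}. \tag{A.124}$$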
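If the remainder $\omega^{\{3\}}$ is identically zero, we are done. Otherwise, we apply
the same process to $\omega^{\{3\}}$ so as to construct $\mathbf{f}^{*3}$, $\mathbf{f}^{*4}$ and $\omega^{\{5\}}$; we continue
in this manner until we find a remainder, $\omega^{\{p+1\}}$, that vanishes.

Let $\{\mathbf{f}_i\}$ be the basis for $V$ dual to the basis $\{\mathbf{f}^{*i}\}$. Then $\omega(\mathbf{f}_1, \mathbf{f}_2) =
-\omega(\mathbf{f}_2, \mathbf{f}_1) = \omega(\mathbf{f}_3, \mathbf{f}_4) = -\omega(\mathbf{f}_4, \mathbf{f}_3) = 1$, and so on, all other values being zero.
This shows that if we define the coefficients $a^i{}_j$ by expressing $\mathbf{f}^{*i} = a^i{}_j\mathbf{e}^{*j}$,
and hence $\mathbf{e}_i = \mathbf{f}_ja^j{}_i$, then the matrix $\Omega$ has been expressed as

$$\Omega = \mathbf{A}^T\widetilde{\Omega}\mathbf{A}, \tag{A.125}$$

where $\mathbf{A}$ is the matrix with entries $a^i{}_j$, and $\widetilde{\Omega}$ is the matrix

$$\widetilde{\Omega} = \begin{pmatrix}
0 & 1 & & & \\
-1 & 0 & & & \\
 & & 0 & 1 & \\
 & & -1 & 0 & \\
 & & & & \ddots
\end{pmatrix}, \tag{A.126}$$

which contains $p/2$ diagonal blocks of

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

and all other entries are zero.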

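Example: Consider the skew bilinear form

$$\omega(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T\Omega\mathbf{y} = (x^1, x^2, x^3, x^4)\begin{pmatrix} 0 & 1 & 3 & 0 \\ -1 & 0 & 1 & 5 \\ -3 & -1 & 0 & 0 \\ 0 & -5 & 0 & 0 \end{pmatrix}\begin{pmatrix} y^1 \\ y^2 \\ y^3 \\ y^4 \end{pmatrix}. \tag{A.127}$$

This corresponds to

$$\omega = \mathbf{e}^{*1} \wedge \mathbf{e}^{*2} + 3\,\mathbf{e}^{*1} \wedge \mathbf{e}^{*3} + \mathbf{e}^{*2} \wedge \mathbf{e}^{*3} + 5\,\mathbf{e}^{*2} \wedge \mathbf{e}^{*4}. \tag{A.128}$$

Following our algorithm, we write $\omega$ as

$$\omega = (\mathbf{e}^{*1} - \mathbf{e}^{*3} - 5\,\mathbf{e}^{*4}) \wedge (\mathbf{e}^{*2} + 3\,\mathbf{e}^{*3}) - 15\,\mathbf{e}^{*3} \wedge \mathbf{e}^{*4}. \tag{A.130}$$

If we now set

$$\mathbf{f}^{*1} = \mathbf{e}^{*1} - \mathbf{e}^{*3} - 5\,\mathbf{e}^{*4}, \qquad \mathbf{f}^{*2} = \mathbf{e}^{*2} + 3\,\mathbf{e}^{*3}, \qquad \mathbf{f}^{*3} = -15\,\mathbf{e}^{*3}, \qquad \mathbf{f}^{*4} = \mathbf{e}^{*4}, \tag{A.131}$$

we have

$$\omega = \mathbf{f}^{*1} \wedge \mathbf{f}^{*2} + \mathbf{f}^{*3} \wedge \mathbf{f}^{*4}. \tag{A.132}$$

We have correspondingly expressed the matrix $\Omega$ as

$$\begin{pmatrix} 0 & 1 & 3 & 0 \\ -1 & 0 & 1 & 5 \\ -3 & -1 & 0 & 0 \\ 0 & -5 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 3 & -15 & 0 \\ -5 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & -1 & -5 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & -15 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{A.133}$$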
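The factorization in this example is easy to confirm by multiplying the three matrices out. The sketch below does so with integer arithmetic; the names `OMEGA`, `A`, `OMEGA_TILDE` and `reconstruct` are labels chosen for this illustration, with the rows of `A` holding the coefficients of the new basis covectors in terms of the old ones.

```python
OMEGA = [[0, 1, 3, 0],
         [-1, 0, 1, 5],
         [-3, -1, 0, 0],
         [0, -5, 0, 0]]

# Row i of A holds the coefficients of f^{*i} in the basis e^{*j}.
A = [[1, 0, -1, -5],
     [0, 1, 3, 0],
     [0, 0, -15, 0],
     [0, 0, 0, 1]]

# Canonical block-diagonal form with two 2-by-2 skew blocks.
OMEGA_TILDE = [[0, 1, 0, 0],
               [-1, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, -1, 0]]


def transpose(M):
    return [list(column) for column in zip(*M)]


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def reconstruct():
    """Return A^T * OMEGA_TILDE * A, which should reproduce OMEGA."""
    return matmul(matmul(transpose(A), OMEGA_TILDE), A)


if __name__ == "__main__":
    print(reconstruct() == OMEGA)
```

Note that the congruence uses the transpose of `A`, not its inverse, which is the distinction drawn earlier between diagonalizing forms and diagonalizing linear maps.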



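Exercise A.13: Let $\Omega$ be a skew symmetric 2n-by-2n matrix with entries
$\omega_{ij} = -\omega_{ji}$. Define the Pfaffian of $\Omega$ by

$$\mathrm{Pf}\,\Omega = \frac{1}{2^n n!}\,\epsilon_{i_1i_2\ldots i_{2n}}\,\omega_{i_1i_2}\omega_{i_3i_4}\cdots\omega_{i_{2n-1}i_{2n}}.$$

Show that $\mathrm{Pf}(\mathbf{M}^T\Omega\mathbf{M}) = \det(\mathbf{M})\,\mathrm{Pf}(\Omega)$. By reducing $\Omega$ to a suitable
canonical form, show that $(\mathrm{Pf}\,\Omega)^2 = \det\Omega$.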
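For small matrices the Pfaffian can be evaluated directly from the permutation sum, which gives a numerical sanity check (not a proof) of the second claim in the exercise. The sketch below assumes the standard definition $\mathrm{Pf}\,\Omega = \frac{1}{2^n n!}\sum_\sigma \mathrm{sgn}(\sigma)\,\omega_{\sigma(1)\sigma(2)}\cdots\omega_{\sigma(2n-1)\sigma(2n)}$; the function names are illustrative, and the brute-force sum over permutations is practical only for very small $n$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def pfaffian(M):
    """Pfaffian of a 2n-by-2n skew matrix from the permutation sum."""
    n = len(M) // 2
    total = 0
    for perm in permutations(range(2 * n)):
        term = sign(perm)
        for k in range(n):
            term *= M[perm[2 * k]][perm[2 * k + 1]]
        total += term
    return Fraction(total, 2 ** n * factorial(n))


def determinant(M):
    """Determinant from the Leibniz permutation sum."""
    total = 0
    for perm in permutations(range(len(M))):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total


if __name__ == "__main__":
    omega = [[0, 1, 3, 0], [-1, 0, 1, 5], [-3, -1, 0, 0], [0, -5, 0, 0]]
    print(pfaffian(omega), determinant(omega))
```

For a 4-by-4 skew matrix the sum collapses to $\omega_{12}\omega_{34} - \omega_{13}\omega_{24} + \omega_{14}\omega_{23}$, which can be compared against the printed value.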

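Exercise A.14: Let $\omega(\mathbf{x}, \mathbf{y})$ be a non-degenerate skew symmetric bilinear form
on $\mathbb{R}^{2n}$, and $\mathbf{x}_1, \ldots, \mathbf{x}_{2n}$ a set of vectors. Prove Weyl's identity

$$\mathrm{Pf}(\Omega)\,\det|\mathbf{x}_1, \ldots, \mathbf{x}_{2n}| = \frac{1}{2^n n!}\,\epsilon_{i_1,\ldots,i_{2n}}\,\omega(\mathbf{x}_{i_1}, \mathbf{x}_{i_2})\cdots\omega(\mathbf{x}_{i_{2n-1}}, \mathbf{x}_{i_{2n}}).$$

Here $\det|\mathbf{x}_1, \ldots, \mathbf{x}_{2n}|$ is the determinant of the matrix whose rows are the $\mathbf{x}_i$
and $\Omega$ is the matrix corresponding to the form $\omega$.

Now let $\mathbf{M} : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a linear map. Show that

$$\mathrm{Pf}(\Omega)\,(\det\mathbf{M})\,\det|\mathbf{x}_1, \ldots, \mathbf{x}_{2n}| = \frac{1}{2^n n!}\,\epsilon_{i_1,\ldots,i_{2n}}\,\omega(\mathbf{M}\mathbf{x}_{i_1}, \mathbf{M}\mathbf{x}_{i_2})\cdots\omega(\mathbf{M}\mathbf{x}_{i_{2n-1}}, \mathbf{M}\mathbf{x}_{i_{2n}}).$$

Deduce that if $\omega(\mathbf{M}\mathbf{x}, \mathbf{M}\mathbf{y}) = \omega(\mathbf{x}, \mathbf{y})$ for all vectors $\mathbf{x}$, $\mathbf{y}$, then $\det\mathbf{M} = 1$.
The set of such matrices $\mathbf{M}$ that preserve $\omega$ compose the symplectic group
$\mathrm{Sp}(2n, \mathbb{R})$.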
Appendix B

Fourier Series and Integrals



Fourier series and Fourier integral representations are the most important
examples of the expansion of a function in terms of a complete orthonormal
set. The material in this appendix reviews features peculiar to these special
cases, and is intended to complement the general discussion of orthogonal
series in chapter 2.

B.1 Fourier Series

A function defined on a finite interval may be expanded as a Fourier series.

B.1.1 Finite Fourier series

Suppose we have measured $f(x)$ in the interval $[0, L]$, but only at the discrete
set of points $x = na$, where $a$ is the sampling interval and $n = 0, 1, \ldots, N - 1$,
with $Na = L$. We can then represent our data $f(na)$ by a finite Fourier
series. This representation is based on the geometric sum

$$\sum_{m=0}^{N-1} e^{ik_m(n'-n)a} = \frac{e^{2\pi i(n'-n)} - 1}{e^{2\pi i(n'-n)/N} - 1}, \tag{B.1}$$

where $k_m = 2\pi m/Na$. For integer $n$ and $n'$, the expression on the right
hand side of (B.1) is zero unless $n - n'$ is an integer multiple of $N$, when
it becomes indeterminate. In this case, however, each term on the left hand
side is equal to unity, and so their sum is equal to $N$. If we restrict $n$ and $n'$
to lie between $0$ and $N - 1$, we have

$$\frac{1}{N}\sum_{m=0}^{N-1} e^{ik_m(n'-n)a} = \delta_{n'n}. \tag{B.2}$$

Inserting (B.2) into the formula

$$f(na) = \sum_{n'=0}^{N-1} f(n'a)\,\delta_{n'n}, \tag{B.3}$$

shows that

$$f(na) = \sum_{m=0}^{N-1} a_m e^{-ik_mna}, \quad \text{where} \quad a_m \equiv \frac{1}{N}\sum_{n=0}^{N-1} f(na)\,e^{ik_mna}. \tag{B.4}$$

This is the finite Fourier representation.
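Because (B.4) is an algebraic identity, it can be verified on any list of numbers. The sketch below follows the sign and normalization conventions of (B.4), with $k_m a = 2\pi m/N$; the function names are illustrative, and the direct double sum is chosen for clarity rather than speed.

```python
import cmath


def finite_fourier_coefficients(samples):
    """a_m = (1/N) sum_n f(na) exp(+i k_m n a), with k_m a = 2 pi m / N."""
    N = len(samples)
    return [sum(f * cmath.exp(2j * cmath.pi * m * n / N)
                for n, f in enumerate(samples)) / N
            for m in range(N)]


def reconstruct(coefficients):
    """f(na) = sum_m a_m exp(-i k_m n a)."""
    N = len(coefficients)
    return [sum(a * cmath.exp(-2j * cmath.pi * m * n / N)
                for m, a in enumerate(coefficients))
            for n in range(N)]


if __name__ == "__main__":
    data = [1.0, -2.5, 0.3, 4.0, 0.0]
    recovered = reconstruct(finite_fourier_coefficients(data))
    print([round(value.real, 12) for value in recovered])
```

The round trip returns the original samples up to rounding error, with no smoothness assumption on the data.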

When $f(na)$ is real, it is convenient to make the $k_m$ sum symmetric about
$k_m = 0$ by taking $N = 2M + 1$ and setting the summation limits to be $\pm M$.
The finite geometric sum then becomes

$$\sum_{m=-M}^{M} e^{im\theta} = \frac{\sin\left((2M+1)\theta/2\right)}{\sin(\theta/2)}. \tag{B.5}$$

We set $\theta = 2\pi(n' - n)/N$ and use the same tactics as before to deduce that

$$f(na) = \sum_{m=-M}^{M} a_m e^{-ik_mna}, \tag{B.6}$$

where again $k_m = 2\pi m/L$, with $L = Na$, and

$$a_m = \frac{1}{N}\sum_{n=0}^{2M} f(na)\,e^{ik_mna}. \tag{B.7}$$

In this form it is manifest that $f$ being real both implies and is implied by
$a_m^* = a_{-m}$.

These finite Fourier expansions are algebraic identities. No limits have
to be taken, and so no restrictions need be placed on $f(na)$ for them to be
valid. They are all that is needed for processing experimental data.

Although the initial $f(na)$ was defined only for the finite range $0 \le n \le
N - 1$, the Fourier sum (B.4) or (B.7) is defined for any $n$, and so extends $f$
to a periodic function of $n$ with period $N$.
B.1.2 Continuum limit

Now we wish to derive a Fourier representation for functions defined
everywhere on the interval $[0, L]$, rather than just at the sampling points. The natural
way to proceed is to build on the results from the previous section by
replacing the interval $[0, L]$ with a discrete lattice of $N = 2M + 1$ points at
$x = na$, where $a$ is a small lattice spacing which we ultimately take to zero.
For any non-zero $a$ the continuum function $f(x)$ is thus replaced by the finite
set of numbers $f(na)$. If we stand back and blur our vision so that we can no
longer perceive the individual lattice points, a plot of this discrete function
will look little different from the original continuum $f(x)$. In other words,
provided that $f$ is slowly varying on the scale of the lattice spacing, $f(an)$
can be regarded as a smooth function of $x = an$.

The basic "integration rule" for such smooth functions is that

$$a\sum_n f(an) \to \int f(an)\,a\,dn \to \int f(x)\,dx, \tag{B.8}$$

as $a$ becomes small. A sum involving a Kronecker $\delta$ will become an integral
containing a Dirac $\delta$-function:

$$a\sum_n f(na)\,\frac{1}{a}\delta_{nm} = f(ma) \to \int f(x)\,\delta(x - y)\,dx = f(y). \tag{B.9}$$

We can therefore think of the $\delta$ function as arising from

$$\frac{\delta_{nn'}}{a} \to \delta(x - x'). \tag{B.10}$$

In particular, the divergent quantity $\delta(0)$ (in $x$ space) is obtained by setting
$n = n'$, and can therefore be understood to be the reciprocal of the lattice
spacing, or, equivalently, the number of lattice points per unit volume.

Now we take the formal continuum limit of (B.7) by letting $a \to 0$ and
$N \to \infty$ while keeping their product $Na = L$ fixed. The finite Fourier
representation

$$f(na) = \sum_{m=-M}^{M} a_m e^{-ik_mna} \tag{B.11}$$

now becomes an infinite series

$$f(x) = \sum_{m=-\infty}^{\infty} a_m e^{-2\pi imx/L}, \tag{B.12}$$

whereas

$$a_m = \frac{a}{Na}\sum_{n=0}^{N-1} f(na)\,e^{ik_mna} \to \frac{1}{L}\int_0^L f(x)\,e^{2\pi imx/L}\,dx. \tag{B.13}$$

The series (B.12) is the Fourier expansion for a function on a finite interval.
The sum is equal to $f(x)$ in the interval $[0, L]$. Outside, it produces $L$-periodic
translates of the original $f$.

This Fourier expansion (B.12, B.13) is the same series that we would obtain
by using the $L^2[0, L]$ orthonormality

$$\frac{1}{L}\int_0^L e^{2\pi imx/L}\,e^{-2\pi inx/L}\,dx = \delta_{nm}, \tag{B.14}$$

and using the methods of chapter two. The arguments adduced there,
however, guarantee convergence only in the $L^2$ sense. While our present
"continuum limit" derivation is only heuristic, it does suggest that for
reasonably-behaved functions $f$ the Fourier series (B.12) converges pointwise to $f(x)$. It
is relatively easy to show that any continuous function is sufficiently
"well-behaved" for pointwise convergence. Furthermore, if the function $f$ is smooth
then the convergence is uniform. This is useful to know, but we often desire
a Fourier representation for a function with discontinuities. A stronger result
is that if $f$ is piecewise continuous in $[0, L]$, i.e., continuous with the
exception of a finite number of discontinuities, then the Fourier series will converge
pointwise (but not uniformly¹) to $f(x)$ at points where $f(x)$ is continuous,
and to its average

$$\frac{1}{2}\lim_{\epsilon\to 0}\left\{f(x + \epsilon) + f(x - \epsilon)\right\} \tag{B.15}$$

at those points where $f(x)$ has jumps. In section B.3.2 we shall explain
why the series converges to this average, and examine the nature of this
convergence.

Most functions of interest to engineers are piecewise continuous, and this
result is then all that they require. In physics, however, we often have to
work with a broader class of functions, and so other forms of convergence
become relevant. In quantum mechanics, in particular, the probability
interpretation of the wavefunction requires only convergence in the $L^2$ sense, and
this demands no smoothness properties at all: the Fourier representation
converges to $f$ whenever the $L^2$ norm $\|f\|^2$ is finite.

¹ If a sequence of continuous functions converges uniformly, then its limit function is
continuous.

Half-range Fourier series

The exponential series

$$f(x) = \sum_{m=-\infty}^{\infty} a_m e^{-2\pi imx/L} \tag{B.16}$$

can be re-expressed as the trigonometric sum

$$f(x) = \frac{1}{2}A_0 + \sum_{m=1}^{\infty}\left\{A_m\cos(2\pi mx/L) + B_m\sin(2\pi mx/L)\right\}, \tag{B.17}$$

where

$$A_m = \begin{cases} 2a_0, & m = 0, \\ a_m + a_{-m}, & m > 0, \end{cases} \qquad B_m = i(a_{-m} - a_m). \tag{B.18}$$

This is called a full-range trigonometric Fourier series for functions defined on
$[0, L]$. In chapter 2 we expanded functions in series containing only sines. We
can expand any function $f(x)$ defined on a finite interval as such a half-range
Fourier series. To do this, we regard the given domain of $f(x)$ as being the
half interval $[0, L/2]$ (hence the name). We then extend $f(x)$ to a function
on the whole of $[0, L]$ and expand as usual. If we extend $f(x)$ by setting
$f(x + L/2) = -f(L/2 - x)$ then the $A_m$ are all zero and we have

$$f(x) = \sum_{m=1}^{\infty} B_m\sin(2\pi mx/L), \quad x \in [0, L/2], \tag{B.19}$$

where

$$B_m = \frac{4}{L}\int_0^{L/2} f(x)\sin(2\pi mx/L)\,dx. \tag{B.20}$$

Alternatively, we may extend the range of definition by setting $f(x + L/2) =
f(L/2 - x)$. In this case it is the $B_m$ that become zero and we have

$$f(x) = \frac{1}{2}A_0 + \sum_{m=1}^{\infty} A_m\cos(2\pi mx/L), \quad x \in [0, L/2], \tag{B.21}$$

with

$$A_m = \frac{4}{L}\int_0^{L/2} f(x)\cos(2\pi mx/L)\,dx. \tag{B.22}$$

The difference between a full-range and a half-range series is therefore
seen principally in the continuation of the function outside its initial interval
of definition. A full range series repeats the function periodically. A
half-range sine series changes the sign of the continued function each time we
pass to an adjacent interval, whilst the half-range cosine series reflects the
function as if each interval endpoint were a mirror.
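As an illustration of the half-range sine series, take $f(x) = 1$ on $[0, L/2]$. Formula (B.20) then gives $B_m = 4/(\pi m)$ for odd $m$ and zero for even $m$. The sketch below evaluates (B.20) by a midpoint rule and sums a truncated series; the function names, the step counts, and the choice of test function are assumptions made for this example.

```python
import math


def sine_coefficient(f, m, L, steps=2000):
    """B_m = (4/L) int_0^{L/2} f(x) sin(2 pi m x / L) dx (midpoint rule)."""
    h = (L / 2) / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += f(x) * math.sin(2 * math.pi * m * x / L)
    return 4.0 / L * total * h


def sine_series(f, x, L, terms=200):
    """Truncated half-range sine series evaluated at x."""
    return sum(sine_coefficient(f, m, L) * math.sin(2 * math.pi * m * x / L)
               for m in range(1, terms + 1))


if __name__ == "__main__":
    def one(x):
        return 1.0

    print(sine_coefficient(one, 1, 2.0), 4 / math.pi)
    print(sine_series(one, 0.5, 2.0))
```

At the midpoint of the half interval the truncated series is close to 1, while at the endpoints it vanishes, reflecting the sign-changing continuation described above.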



B.2 Fourier Integral Transforms

When the function we wish to represent is defined on the entirety of $\mathbb{R}$ then we
can use the Fourier integral representation.



B.2.1 Inversion formula

We can obtain this formally from the Fourier series for a function defined on
$[-L/2, L/2]$, where

$$f(x) = \sum_{m=-\infty}^{\infty} a_m e^{-\frac{2\pi im}{L}x}, \tag{B.23}$$

$$a_m = \frac{1}{L}\int_{-L/2}^{L/2} f(x)\,e^{\frac{2\pi im}{L}x}\,dx, \tag{B.24}$$

by letting $L$ become large. The discrete $k_m = 2\pi m/L$ then merge into the
continuous variable $k$ and

$$\sum_{m=-\infty}^{\infty} \to \int_{-\infty}^{\infty} dm = L\int_{-\infty}^{\infty}\frac{dk}{2\pi}. \tag{B.25}$$

The product $La_m$ remains finite, and becomes a function that we shall call
$\tilde{f}(k)$. Thus

$$f(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\tilde{f}(k)\,e^{-ikx}, \tag{B.26}$$

$$\tilde{f}(k) = \int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx. \tag{B.27}$$

This is the Fourier integral transform and its inverse.

It is good practice when doing Fourier transforms in physics to treat $x$
and $k$ asymmetrically: always put the $2\pi$'s with the $dk$'s. This is because,
as (B.25) shows, $dk/2\pi$ has the physical meaning of the number of Fourier
modes per unit (spatial) volume with wavenumber between $k$ and $k + dk$.

The Fourier representation of the Dirac delta-function is

$$\delta(x - x') = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ik(x-x')}. \tag{B.28}$$

Suppose we put $x = x'$. Then "$\delta(0)$", which we earlier saw can be interpreted
as the inverse lattice spacing, and hence the density of lattice points, is equal
to $\int\frac{dk}{2\pi}$. This is the total number of Fourier modes per unit length.

Exchanging $x$ and $k$ in the integral representation of $\delta(x - x')$ gives us
the Fourier representation for $\delta(k - k')$:

$$\int_{-\infty}^{\infty} e^{i(k-k')x}\,dx = 2\pi\,\delta(k - k'). \tag{B.29}$$

Thus $2\pi\delta(0)$ (in $k$ space), although mathematically divergent, has the
physical meaning $\int dx$, the volume of the system. It is good practice to put a $2\pi$
with each $\delta(k)$ because this combination has a direct physical interpretation.

Take care to note that the symbol $\delta(0)$ has a very different physical
interpretation depending on whether $\delta$ is a delta-function in $x$ or in $k$ space.

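Parseval's identity

Note that with the Fourier transform pair defined as

$$\tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikx}f(x)\,dx, \tag{B.30}$$

$$f(x) = \int_{-\infty}^{\infty} e^{-ikx}\tilde{f}(k)\,\frac{dk}{2\pi}, \tag{B.31}$$

Parseval's theorem takes the form

$$\int_{-\infty}^{\infty}|f(x)|^2\,dx = \int_{-\infty}^{\infty}|\tilde{f}(k)|^2\,\frac{dk}{2\pi}. \tag{B.32}$$

Parseval's theorem tells us that the Fourier transform is a unitary map
from $L^2(\mathbb{R}) \to L^2(\mathbb{R})$.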
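Parseval's identity, with the $2\pi$ attached to $dk$, can be checked numerically on a Gaussian. The sketch below assumes the test function $f(x) = e^{-x^2/2}$, whose transform in the present convention is $\tilde{f}(k) = \sqrt{2\pi}\,e^{-k^2/2}$; both sides should then equal $\sqrt{\pi}$. The function names and the integration window are choices made for this illustration.

```python
import math


def integrate(func, lo, hi, steps=4000):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (j + 0.5) * h) for j in range(steps))


def f(x):
    return math.exp(-x * x / 2)


def f_tilde(k):
    # Closed-form transform of the Gaussian in the e^{+ikx} convention.
    return math.sqrt(2 * math.pi) * math.exp(-k * k / 2)


def parseval_sides():
    """Return (int |f|^2 dx, int |f_tilde|^2 dk / 2 pi)."""
    lhs = integrate(lambda x: f(x) ** 2, -10.0, 10.0)
    rhs = integrate(lambda k: f_tilde(k) ** 2 / (2 * math.pi), -10.0, 10.0)
    return lhs, rhs


if __name__ == "__main__":
    print(parseval_sides(), math.sqrt(math.pi))
```

If the $2\pi$ were omitted from the $k$ integral the two sides would differ by exactly that factor, which is a quick way to detect a mismatched convention.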
B.2.2 The Riemann-Lebesgue lemma

There is a reciprocal relationship between the rates at which a function and 
its Fourier transform decay at infinity. The more rapidly the function decays, 
the more high frequency modes it must contain — and hence the slower the 
decay of its Fourier transform. Conversely, the smoother a function the fewer 
high frequency modes it contains and the faster the decay of its transform. 
Quantitative estimates of this version of Heisenberg's uncertainty principle 
are based on the Riemann-Lebesgue lemma. 

Recall that a function $f$ is in $L^1(\mathbb{R})$ if it is integrable (this condition
excludes the delta function) and goes to zero at infinity sufficiently rapidly
that

$$\|f\|_1 \equiv \int_{-\infty}^{\infty}|f|\,dx < \infty. \tag{B.33}$$

If $f \in L^1(\mathbb{R})$ then its Fourier transform

$$\tilde{f}(k) = \int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx \tag{B.34}$$

exists, is a continuous function of $k$, and

$$|\tilde{f}(k)| \le \|f\|_1. \tag{B.35}$$

The Riemann-Lebesgue lemma asserts that if $f \in L^1(\mathbb{R})$ then

$$\lim_{k\to\infty}\tilde{f}(k) = 0. \tag{B.36}$$

We will not give the proof. For $f$ integrable in the Riemann sense, it is not
difficult, being almost a corollary of the definition of the Riemann integral.
We must point out, however, that the "$|\ldots|$" modulus sign is essential in
the $L^1$ condition. For example, the integral

$$I = \int_{-\infty}^{\infty}\sin(x^2)\,dx \tag{B.37}$$

is convergent, but only because of extensive cancellations. The $L^1$ norm

$$\int_{-\infty}^{\infty}|\sin(x^2)|\,dx \tag{B.38}$$

is not finite, and whereas the Fourier transform of $\sin(x^2)$, i.e.

$$\int_{-\infty}^{\infty}\sin(x^2)\,e^{ikx}\,dx = \sqrt{\pi}\cos\left(\frac{k^2 + \pi}{4}\right), \tag{B.39}$$

is also convergent, it does not decay to zero as $k$ grows large.

The Riemann-Lebesgue lemma tells us that the Fourier transform maps
$L^1(\mathbb{R})$ into $C_\infty(\mathbb{R})$, the latter being the space of continuous functions
vanishing at infinity. Be careful: this map is only into and not onto. The inverse
Fourier transform of a function vanishing at infinity does not necessarily lie
in $L^1(\mathbb{R})$.

We link the smoothness of $f(x)$ to the rapid decay of $\tilde{f}(k)$, by combining
Riemann-Lebesgue with integration by parts. Suppose that both $f$ and $f'$
are in $L^1(\mathbb{R})$. Then

$$\int_{-\infty}^{\infty} f'(x)\,e^{ikx}\,dx = -ik\int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx = -ik\tilde{f}(k) \tag{B.40}$$

tends to zero. (No boundary terms arise from the integration by parts
because in order for a differentiable function $f$ to be in $L^1$, it must tend to zero
at infinity.) Since $k\tilde{f}(k)$ tends to zero, $\tilde{f}(k)$ itself must go to zero faster than
$1/k$. We can continue in this manner and see that each additional derivative
of $f$ that lies in $L^1(\mathbb{R})$ buys us an extra power of $1/k$ in the decay rate of
$\tilde{f}$ at infinity. If any derivative possesses a jump discontinuity, however, its
derivative will contain a delta-function, and a delta-function is not in $L^1$.
Thus, if $n$ is the largest integer for which $k^n\tilde{f}(k) \to 0$ we may expect $f^{(n)}(x)$
to be somewhere discontinuous. For example, the function $f(x) = e^{-|x|}$ has
a first derivative that lies in $L^1$, but is discontinuous. Its Fourier transform
$\tilde{f}(k) = 2/(1 + k^2)$ therefore decays as $1/k^2$, but no faster.
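The closing example can be confirmed numerically. Since $e^{-|x|}$ is even, its transform reduces to $2\int_0^\infty e^{-x}\cos(kx)\,dx$. The sketch below approximates this with a midpoint rule on a truncated interval; the function name, cutoff, and step count are choices made for this illustration.

```python
import math


def transform_exp_abs(k, cutoff=40.0, steps=40000):
    """Approximate int e^{-|x|} e^{ikx} dx = 2 int_0^cutoff e^{-x} cos(kx) dx."""
    h = cutoff / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += math.exp(-x) * math.cos(k * x)
    return 2.0 * h * total


if __name__ == "__main__":
    for k in (0.0, 1.0, 3.0, 10.0):
        print(k, transform_exp_abs(k), 2.0 / (1.0 + k * k))
```

Multiplying the computed values by $k^2$ shows them approaching the constant 2 as $k$ grows, which is the $1/k^2$ decay expected from a function whose first derivative jumps.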

B.3 Convolution

Suppose that $f(x)$ and $g(x)$ are functions on the real line $\mathbb{R}$. We define their
convolution $f * g$, when it exists, by

$$[f * g](x) \equiv \int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,d\xi. \tag{B.41}$$

A change of variable $\xi \to x - \xi$ shows that, despite the apparently asymmetric
treatment of $f$ and $g$ in the definition, the $*$ product obeys $f * g = g * f$.
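B.3.1 The convolution theorem

Now, let $\tilde{f}(k)$ denote the Fourier transform of $f$, i.e.

$$\tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikx}f(x)\,dx. \tag{B.42}$$

We claim that

$$\widetilde{[f * g]} = \tilde{f}\,\tilde{g}. \tag{B.43}$$

The following computation shows that this claim is correct:

$$\begin{aligned}
\widetilde{[f * g]}(k) &= \int_{-\infty}^{\infty} e^{ikx}\left(\int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,d\xi\right)dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{ikx}f(x - \xi)\,g(\xi)\,d\xi\,dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{ik(x-\xi)}e^{ik\xi}f(x - \xi)\,g(\xi)\,d\xi\,dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{ikx'}e^{ik\xi}f(x')\,g(\xi)\,d\xi\,dx' \\
&= \left(\int_{-\infty}^{\infty} e^{ikx'}f(x')\,dx'\right)\left(\int_{-\infty}^{\infty} e^{ik\xi}g(\xi)\,d\xi\right) \\
&= \tilde{f}(k)\,\tilde{g}(k).
\end{aligned} \tag{B.44}$$

Note that we have freely interchanged the order of integrations. This is not
always permissible, but it is allowed if $f, g \in L^1(\mathbb{R})$, in which case $f * g$ is
also in $L^1(\mathbb{R})$.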
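A concrete check of the convolution theorem uses $f = g = e^{-|x|}$, for which $f * f = (1 + |x|)e^{-|x|}$ and $\tilde{f}(k) = 2/(1 + k^2)$. The sketch below compares a numerically computed convolution with this closed form, and the transform of the convolution with $\tilde{f}(k)^2$. The function names, integration windows, and step counts are assumptions of this example, and the `transform` helper is valid only for even functions.

```python
import math


def integrate(func, lo, hi, steps):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (j + 0.5) * h) for j in range(steps))


def f(x):
    return math.exp(-abs(x))


def convolve_at(x, half_width=30.0, steps=60000):
    """[f * f](x) = int f(x - xi) f(xi) d xi, truncated to a finite window."""
    return integrate(lambda xi: f(x - xi) * f(xi), -half_width, half_width, steps)


def conv_closed_form(x):
    """Closed form of e^{-|x|} convolved with itself."""
    return (1.0 + abs(x)) * math.exp(-abs(x))


def transform(func, k, half_width=40.0, steps=80000):
    """Fourier transform of an even function (the sine part vanishes)."""
    return integrate(lambda x: func(x) * math.cos(k * x),
                     -half_width, half_width, steps)


if __name__ == "__main__":
    print(convolve_at(0.7), conv_closed_form(0.7))
    print(transform(conv_closed_form, 1.5), transform(f, 1.5) ** 2)
```

Both printed pairs agree to within the accuracy of the quadrature, as (B.43) predicts.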



ES. 3. 2 Apodization and Gibbs' phenomenon 

The convolution theorem is useful for understanding what happens when we 
truncate a Fourier series at a finite number of terms, or cut off a Fourier 
integral at a finite frequency or wavenumber. 

Consider, for example, the cut-off Fourier integral representation 

f A (x) = ^- [ A f(k)e- ikx dk, (B.45) 
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where f(k) = f(x) e tkx dx is the Fourier transform of /. We can write 
this as 

fo(x) = — / 9 A (k)f(k) e~ lkx dk (B.46) 

where $\theta_\Lambda(k)$ is unity if $|k| < \Lambda$ and zero otherwise. Written this way, the 
Fourier transform of $f_\Lambda$ becomes the product of the Fourier transform of the 
original $f$ with $\theta_\Lambda$. The function $f_\Lambda$ itself is therefore the convolution 

$$
f_\Lambda(x) = \int_{-\infty}^{\infty} \delta^F_\Lambda(x-\xi)\, f(\xi)\, d\xi \tag{B.47}
$$

of $f$ with 

$$
\delta^F_\Lambda(x) \equiv \frac{1}{\pi} \frac{\sin(\Lambda x)}{x} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \theta_\Lambda(k)\, e^{-ikx}\, dk, \tag{B.48}
$$

which is the inverse Fourier transform of $\theta_\Lambda(k)$. We see that $f_\Lambda(x)$ is a kind of 
local average of the values of $f(x)$ smeared by the approximate delta-function 
$\delta^F_\Lambda(x)$. (The superscript F stands for "Fourier".) 




A plot of $\pi \delta^F_\Lambda(x)$ for $\Lambda = 3$. 



When $f(x)$ can be treated as a constant on the scale ($\approx 2\pi/\Lambda$) of the oscillation 
in $\delta^F_\Lambda(x)$, all that matters is that $\int_{-\infty}^{\infty} \delta^F_\Lambda(x)\, dx = 1$, and so $f_\Lambda(x) \approx f(x)$. 
This is the case if $f(x)$ is smooth and $\Lambda$ is sufficiently large. However, if $f(x)$ 
possesses a discontinuity at $x_0$, say, then we can never treat it as a constant 
and the rapid oscillations in $\delta^F_\Lambda(x)$ cause a "ringing" in $f_\Lambda(x)$ whose amplitude 
does not decrease (although the width of the region surrounding $x_0$ in 
which the effect is noticeable will decrease) as $\Lambda$ grows. This ringing is known 
as Gibbs' phenomenon. 

The Gibbs phenomenon: A Fourier reconstruction of a piecewise constant 
function that jumps discontinuously from $y = -0.25$ to $+0.25$ at $x = 0.25$. 

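The amplitude of the ringing is largest immediately on either side of the 
point of discontinuity, where it is about 9% of the jump in $f$. This magnitude 
is determined by the area under the central spike in $\delta^F_\Lambda(x)$, which is 

$$
\frac{1}{\pi} \int_{-\pi/\Lambda}^{\pi/\Lambda} \frac{\sin(\Lambda x)}{x}\, dx \approx 1.18, \tag{B.49}
$$

independent of $\Lambda$. For $x$ exactly at the point of discontinuity, $f_\Lambda(x)$ receives 
equal contributions from both sides of the jump and hence converges to the 
average 

$$
\lim_{\Lambda \to \infty} f_\Lambda(x) = \frac{1}{2} \left\{ f(x_+) + f(x_-) \right\}, \tag{B.50}
$$

where $f(x_\pm)$ are the limits of $f$ taken from the right and left, respectively. 
When $x = x_0 - \pi/\Lambda$, however, the central spike lies entirely to the left of the 
point of discontinuity and 

$$
\begin{aligned}
f_\Lambda(x) &\approx \frac{1}{2} \left\{ (1 + 1.18)\, f(x_-) + (1 - 1.18)\, f(x_+) \right\} \\
&\approx f(x_-) + 0.09 \left\{ f(x_-) - f(x_+) \right\}.
\end{aligned} \tag{B.51}
$$

Consequently, $f_\Lambda(x)$ overshoots its target $f(x_-)$ by approximately 9% of the 
discontinuity. Similarly when $x = x_0 + \pi/\Lambda$ 

$$
f_\Lambda(x) \approx f(x_+) + 0.09 \left\{ f(x_+) - f(x_-) \right\}. \tag{B.52}
$$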
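
The figures 1.18 and 9% quoted above can be reproduced with a few lines of arithmetic. The sketch below is an illustration rather than part of the original notes. It assumes a unit step ($f = 0$ to the left of $x_0$ and $f = 1$ to the right), for which the convolution with $\delta^F_\Lambda$ reduces to $f_\Lambda(x) = \tfrac12 + \mathrm{Si}(\Lambda(x - x_0))/\pi$, where $\mathrm{Si}(u) = \int_0^u \sin t / t \, dt$. The function names are hypothetical.

```python
import math

def sine_integral(a, n=20000):
    """Midpoint-rule value of Si(a) = integral of sin(t)/t from 0 to a (a > 0)."""
    h = a / n
    return h * sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(n))

# Area under the central spike of the kernel sin(Lambda x)/(pi x).
# Substituting t = Lambda x removes Lambda, leaving (2/pi) Si(pi).
central_area = 2.0 * sine_integral(math.pi) / math.pi

def step_reconstruction(u):
    """Truncated reconstruction of the unit step at x = x0 + u/Lambda."""
    if u == 0:
        return 0.5
    return 0.5 + math.copysign(sine_integral(abs(u)), u) / math.pi

overshoot = step_reconstruction(math.pi) - 1.0   # just right of the jump, target 1
undershoot = step_reconstruction(-math.pi)       # just left of the jump, target 0

print(f"central area = {central_area:.4f}")
print(f"overshoot    = {overshoot:.4f} of the jump")
print(f"undershoot   = {undershoot:.4f} of the jump")
```

Because the cutoff $\Lambda$ drops out after rescaling, the overshoot as a fraction of the jump is the same for every $\Lambda$; only its distance $\pi/\Lambda$ from the discontinuity shrinks.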

The ringing is a consequence of the abrupt truncation of the Fourier sum. 
If, instead of a sharp cutoff, we gradually de-emphasize the higher frequencies 
by the replacement 

$$
\tilde f(k) \to \tilde f(k)\, e^{-\alpha k^2/2} \tag{B.53}
$$

then 

$$
\begin{aligned}
f_\alpha(x) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde f(k)\, e^{-\alpha k^2/2}\, e^{-ikx}\, dk \\
&= \int_{-\infty}^{\infty} \delta^G_\alpha(x-\xi)\, f(\xi)\, d\xi
\end{aligned} \tag{B.54}
$$

where 

$$
\delta^G_\alpha(x) = \frac{1}{\sqrt{2\pi\alpha}}\, e^{-x^2/2\alpha}, \tag{B.55}
$$

is a non-oscillating Gaussian approximation to a delta function. The effect 
of this convolution is to smooth out, or mollify, the original $f$, resulting in 
a $C^\infty$ function. As $\alpha$ becomes small, the limit of $f_\alpha(x)$ will again be the 
local average of $f(x)$, so at a discontinuity $f_\alpha$ will converge to the mean 
$\frac{1}{2}\{f(x_+) + f(x_-)\}$. 

When reconstructing a signal from a finite range of its Fourier components 
(for example, from the output of an aperture-synthesis radio telescope) it is 
good practice to smoothly suppress the higher frequencies in such a manner. 
This process is called apodizing (i.e. cutting off the feet of) the data. If 
we fail to apodize then any interesting sharp feature in the signal will be 
surrounded by "diffraction ring" artifacts. 
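
To see that Gaussian apodization removes the ringing, one can convolve a discontinuous function with the Gaussian kernel directly. The following sketch is illustrative only: it assumes a unit step at $x = 0$, an arbitrary width $\alpha = 0.01$, and a midpoint-rule quadrature, and it compares the result with the closed form $\tfrac12[1 + \mathrm{erf}(x/\sqrt{2\alpha})]$, which is the standard integral of a Gaussian.

```python
import math

def gaussian_kernel(x, alpha):
    """Gaussian approximation to a delta function: exp(-x^2/(2 alpha)) / sqrt(2 pi alpha)."""
    return math.exp(-x * x / (2.0 * alpha)) / math.sqrt(2.0 * math.pi * alpha)

def smoothed_step(x, alpha, n=4000):
    """Midpoint-rule convolution of the kernel with the unit step (f = 1 for xi > 0).

    The integral over xi > 0 is cut off ten standard deviations beyond
    max(x, 0), where the kernel is negligible.
    """
    upper = max(x, 0.0) + 10.0 * math.sqrt(alpha)
    h = upper / n
    return h * sum(gaussian_kernel(x - (i + 0.5) * h, alpha) for i in range(n))

alpha = 0.01                                     # illustrative smoothing width
xs = [-0.5 + 0.01 * i for i in range(101)]       # sample points across the jump
values = [smoothed_step(x, alpha) for x in xs]
closed_form = [0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * alpha))) for x in xs]

max_value = max(values)
max_error = max(abs(v - c) for v, c in zip(values, closed_form))
print(f"largest smoothed value = {max_value:.6f}")
print(f"value at the jump      = {smoothed_step(0.0, alpha):.6f}")
print(f"max deviation from erf = {max_error:.2e}")
```

The smoothed step rises monotonically from 0 to 1 and passes through 1/2 at the jump, in contrast with the roughly 9% overshoot produced by the sharp cutoff.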

Exercise B.1: Suppose that we exponentially suppress the higher frequencies 
by multiplying the Fourier amplitude $\tilde f(k)$ by $e^{-\epsilon |k|}$. Show that the original 
signal is smoothed by convolution with a Lorentzian approximation to a delta 
function 

$$
\delta^L_\epsilon(x - \xi) = \frac{1}{\pi} \frac{\epsilon}{\epsilon^2 + (x-\xi)^2}.
$$

Observe that 

$$
\lim_{\epsilon \to 0} \delta^L_\epsilon(x) = \delta(x).
$$



Exercise B.2: Consider the apodized Fourier series 

$$
f_r(\theta) = \sum_{n=-\infty}^{\infty} a_n\, r^{|n|}\, e^{in\theta},
$$

where the parameter $r$ lies in the range $0 < r < 1$, and the coefficients are 

$$
a_n \equiv \frac{1}{2\pi} \int_0^{2\pi} e^{-in\theta} f(\theta)\, d\theta.
$$

Assuming that it is legitimate to interchange the order of the sum and integral, 
show that 

$$
f_r(\theta) = \int_0^{2\pi} \delta^P_r(\theta - \theta')\, f(\theta')\, d\theta'.
$$

Here the superscript P stands for Poisson because $\delta^P_r(\theta)$ is the Poisson 
kernel that solves the Dirichlet problem in the unit disc. Show that $\delta^P_r(\theta)$ 
tends to a delta function as $r \to 1$ from below. 



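Exercise B.3: The periodic Hilbert transform. Show that in the limit $r \to 1$ 
the sum 

$$
\sum_{n=-\infty}^{\infty} \mathrm{sgn}(n)\, e^{in\theta}\, r^{|n|} = \frac{r e^{i\theta}}{1 - r e^{i\theta}} - \frac{r e^{-i\theta}}{1 - r e^{-i\theta}}, \qquad 0 < r < 1
$$

becomes the principal-part distribution 

$$
P\left( i \cot\left( \frac{\theta}{2} \right) \right).
$$

Let $f(\theta)$ be a smooth function on the unit circle, and define its Hilbert 
transform $\mathcal{H}f$ to be 

$$
(\mathcal{H}f)(\theta) = \frac{1}{2\pi} P \int_0^{2\pi} f(\theta') \cot\left( \frac{\theta - \theta'}{2} \right) d\theta'.
$$

Show the original function can be recovered from $(\mathcal{H}f)(\theta)$, together with 
knowledge of the angular average $\bar f = \int_0^{2\pi} f(\theta)\, d\theta / 2\pi$, as 

$$
\begin{aligned}
f(\theta) &= -\frac{1}{2\pi} P \int_0^{2\pi} (\mathcal{H}f)(\theta') \cot\left( \frac{\theta - \theta'}{2} \right) d\theta' + \frac{1}{2\pi} \int_0^{2\pi} f(\theta')\, d\theta' \\
&= -(\mathcal{H}^2 f)(\theta) + \bar f.
\end{aligned}
$$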
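
Exercise B.4: Find a closed-form expression for the sum 

$$
\sum_{n=-\infty}^{\infty} |n|\, e^{in\theta}\, r^{2|n|}, \qquad 0 < r < 1.
$$

Now let $f(\theta)$ be a smooth function defined on the unit circle and 

$$
a_n = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, e^{-in\theta}\, d\theta
$$

its $n$-th Fourier coefficient. By taking a limit $r \to 1$, show that 

$$
\pi \sum_{n=-\infty}^{\infty} |n|\, a_n a_{-n} = \frac{\pi}{4} \int_0^{2\pi} \int_0^{2\pi} [f(\theta) - f(\theta')]^2\, \mathrm{cosec}^2\left( \frac{\theta - \theta'}{2} \right) \frac{d\theta}{2\pi} \frac{d\theta'}{2\pi},
$$

both the sum and integral being convergent. Show that these last two 
expressions are equal to 

$$
\frac{1}{2} \iint_{r<1} |\nabla \varphi|^2\, r\, dr\, d\theta
$$

where $\varphi(r, \theta)$ is the function harmonic in the unit disc, whose boundary value 
is $f(\theta)$. 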

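Exercise B.5: Let $\tilde f(k)$ be the Fourier transform of the smooth real function 
$f(x)$. Take a suitable limit in the previous problem to show that 

$$
\frac{1}{4\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left\{ \frac{f(x) - f(x')}{x - x'} \right\}^2 dx\, dx' = \frac{1}{2} \int_{-\infty}^{\infty} |k|\, |\tilde f(k)|^2\, \frac{dk}{2\pi}.
$$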



Exercise B.6: By taking a suitable limit in exercise B.3 show that, when acting 
on smooth functions $f$ such that $\int_{-\infty}^{\infty} |f|\, dx$ is finite, we have $\mathcal{H}(\mathcal{H}f) = -f$, 
where 

$$
(\mathcal{H}f)(x) = \frac{P}{\pi} \int_{-\infty}^{\infty} \frac{f(x')}{x - x'}\, dx'
$$

defines the Hilbert transform of a function on the real line. (Because $\mathcal{H}$ gives 
zero when acting on a constant, some condition, such as $\int_{-\infty}^{\infty} |f|\, dx$ being finite, 
is necessary if $\mathcal{H}$ is to be invertible.) 

B.4 The Poisson Summation Formula 

Suppose that $f(x)$ is a smooth function that tends rapidly to zero at infinity. 
Then the series 

$$
F(x) = \sum_{n=-\infty}^{\infty} f(x + nL) \tag{B.56}
$$

converges to a smooth function of period $L$. It therefore has a Fourier 
expansion 

$$
F(x) = \sum_{m=-\infty}^{\infty} a_m\, e^{-2\pi i m x/L} \tag{B.57}
$$

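and we can compute the Fourier coefficients by integrating term-by-term 

$$
\begin{aligned}
a_m &= \frac{1}{L} \int_0^L F(x)\, e^{2\pi i m x/L}\, dx \\
&= \frac{1}{L} \sum_{n=-\infty}^{\infty} \int_0^L f(x + nL)\, e^{2\pi i m x/L}\, dx \\
&= \frac{1}{L} \int_{-\infty}^{\infty} f(x)\, e^{2\pi i m x/L}\, dx \\
&= \frac{1}{L}\, \tilde f(2\pi m/L).
\end{aligned} \tag{B.58}
$$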



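Thus 

$$
\sum_{n=-\infty}^{\infty} f(x + nL) = \frac{1}{L} \sum_{m=-\infty}^{\infty} \tilde f(2\pi m/L)\, e^{-2\pi i m x/L}. \tag{B.59}
$$

If we set $x = 0$, this becomes 

$$
\sum_{n=-\infty}^{\infty} f(nL) = \frac{1}{L} \sum_{m=-\infty}^{\infty} \tilde f(2\pi m/L). \tag{B.60}
$$

The equality of this pair of doubly infinite sums is called the Poisson 
summation formula. 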
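
As a quick numerical sanity check of the summation formula, one can compare the two sides for a Gaussian, whose transform is known in closed form. The sketch below is an illustration, not part of the original notes: it takes $L = 1$ and $f(x) = e^{-\kappa x^2}$, uses the standard result $\tilde f(k) = \sqrt{\pi/\kappa}\, e^{-k^2/4\kappa}$, and truncates both sums at a hypothetical cutoff of 200 terms on each side.

```python
import math

def direct_sum(kappa, terms=200):
    """Sum of f(n) = exp(-kappa n^2) over integers n with |n| <= terms."""
    return sum(math.exp(-kappa * n * n) for n in range(-terms, terms + 1))

def transformed_sum(kappa, terms=200):
    """Sum of the Gaussian's Fourier transform sampled at k = 2 pi m, |m| <= terms."""
    prefactor = math.sqrt(math.pi / kappa)
    return prefactor * sum(
        math.exp(-(m * math.pi) ** 2 / kappa) for m in range(-terms, terms + 1)
    )

checks = {}
for kappa in (0.1, 1.0, 5.0):
    checks[kappa] = (direct_sum(kappa), transformed_sum(kappa))
    print(kappa, checks[kappa])
```

For each value of $\kappa$ the two numbers should agree to floating-point precision, although the number of terms that contribute appreciably to each sum differs markedly.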

Example: As the Fourier transform of a Gaussian is another Gaussian, the 
Poisson formula with $L = 1$ applied to $f(x) = \exp(-\kappa x^2)$ gives 

$$
\sum_{m=-\infty}^{\infty} e^{-\kappa m^2} = \sqrt{\frac{\pi}{\kappa}} \sum_{m=-\infty}^{\infty} e^{-m^2 \pi^2/\kappa}, \tag{B.61}
$$

and (rather more usefully) applied to $\exp(-\frac{1}{2} t x^2 + i x \theta)$ gives 

$$
\sum_{n=-\infty}^{\infty} e^{-\frac{1}{2} t n^2 + i n \theta} = \sqrt{\frac{2\pi}{t}} \sum_{n=-\infty}^{\infty} e^{-\frac{1}{2t} (\theta + 2\pi n)^2}. \tag{B.62}
$$

n=— oo n=— oo 

The last identity is known as Jacobi's imaginary transformation. It reflects 
the equivalence of the eigenmode expansion and the method-of-images solution 
of the diffusion equation 

$$
\frac{1}{2} \frac{\partial^2 \varphi}{\partial x^2} = \frac{\partial \varphi}{\partial t} \tag{B.63}
$$

on the unit circle. Notice that when $t$ is small the sum on the left-hand side 
(the eigenmode expansion) converges very slowly, whereas the sum on the right 
(the sum over images) converges very rapidly. The opposite is true for large $t$. 
The conversion of a slowly converging series into a rapidly converging one is 
a standard application of the Poisson summation formula. It is the prototype 
of many duality maps that exchange a 
physical model with a large coupling constant for one with weak coupling. 
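
The trade-off in convergence rate can be made concrete by counting how many terms each side of (B.62) needs. The following is a rough sketch under stated assumptions, not part of the original notes: the tolerance $10^{-10}$, the angle $\theta = 0.1$, the two values of $t$, and the helper names are all illustrative choices, and "terms needed" means the smallest symmetric cutoff $N$ for which the partial sum over $|n| \le N$ matches a well-converged reference.

```python
import math

def eigenmode_sum(t, theta, N):
    """Partial sum over |n| <= N of exp(-t n^2/2 + i n theta); the result is real."""
    return 1.0 + 2.0 * sum(
        math.exp(-0.5 * t * n * n) * math.cos(n * theta) for n in range(1, N + 1)
    )

def image_sum(t, theta, N):
    """Partial sum over |n| <= N of sqrt(2 pi/t) exp(-(theta + 2 pi n)^2 / (2t))."""
    return math.sqrt(2.0 * math.pi / t) * sum(
        math.exp(-((theta + 2.0 * math.pi * n) ** 2) / (2.0 * t))
        for n in range(-N, N + 1)
    )

def terms_needed(partial, t, theta, tol=1e-10, n_ref=400):
    """Smallest cutoff N whose partial sum is within tol of the N = n_ref value."""
    reference = partial(t, theta, n_ref)
    for N in range(n_ref + 1):
        if abs(partial(t, theta, N) - reference) < tol:
            return N
    return n_ref

theta = 0.1
counts = {}
for t in (0.01, 50.0):
    counts[t] = (
        terms_needed(eigenmode_sum, t, theta),
        terms_needed(image_sum, t, theta),
    )
    print(f"t = {t}: eigenmode cutoff N = {counts[t][0]}, image cutoff N = {counts[t][1]}")

# Both sides agree once enough terms are kept.
agreement = abs(eigenmode_sum(1.0, 0.3, 50) - image_sum(1.0, 0.3, 50))
```

For the small value of $t$ the sum of Gaussian images is already converged with its central term while the eigenmode sum needs dozens of terms; for the large value the roles are reversed.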

If we take the limit $t \to 0$ in (B.62), the right hand side approaches a 
sum of delta functions, and so gives us the useful identity 

$$
\frac{1}{2\pi} \sum_{n=-\infty}^{\infty} e^{inx} = \sum_{n=-\infty}^{\infty} \delta(x + 2\pi n). \tag{B.64}
$$

The right-hand side of (B.64) is sometimes called the "Dirac comb." 

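Exercise B.7: By applying the Poisson summation formula to the Fourier 
transform pair 

$$
f(x) = e^{-\epsilon |x|}\, e^{-i x \theta}, \quad \text{and} \quad \tilde f(k) = \frac{2\epsilon}{\epsilon^2 + (k - \theta)^2},
$$

where $\epsilon > 0$, deduce that 

$$
\frac{\sinh \epsilon}{\cosh \epsilon - \cos(\theta - \theta')} = \sum_{n=-\infty}^{\infty} \frac{2\epsilon}{\epsilon^2 + (\theta - \theta' + 2\pi n)^2}. \tag{B.65}
$$

Hence show that the Poisson kernel is equivalent to an infinite periodic sum 
of Lorentzians 

$$
\frac{1}{2\pi} \left( \frac{1 - r^2}{1 - 2r \cos(\theta - \theta') + r^2} \right) = -\frac{1}{\pi} \sum_{n=-\infty}^{\infty} \frac{\ln r}{(\ln r)^2 + (\theta - \theta' + 2\pi n)^2}.
$$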

diffraction, 234 

dimensional regularization, 226 

Dirac comb, 417 

Dirac notation, 368, 373 

bra and ket vectors, 373 

dagger (†) map, 373 
direct sum, 376 
Dirichlet 

principle, 213 
dispersion, 243 
equation, 245 
distributions 

delta function, 73 

elements of $L^2$ as, 77 

Heaviside, 89, 211 

principal part, 80 

theory of, 74 
divergence theorem, 19 
divergence, in curvilinear co-ordinates, 
283 

domain 

of dependence, 197 
of differential operator, 109 
of function, 54 

dual space, 74, 370 

eigenvalue, 52, 388 

as Lagrange multiplier, 37 
eigenvector, 388 
endpoint 

fixed, 2 

singular, 316 

variable, 29 
energy 

as first integral, 16 

density, 23 
internal, 27 

field, 26 
enthalpy, 28 
entropy, 37 

specific, 28 
equivalence class, 55, 377 
equivalent sequence, 61 
Euler's equation, 28 
Euler-Lagrange equation, 3 
Euler-Mascheroni constant, 297 

Faraday, Michael, 24 



field theory, 19 
first integral, 10 
flow 

barotropic, 28 

incompressible, 33 

irrotational, 27 

Lagrangian versus Eulerian description, 27 

Fourier series, 59, 404 

Fourier, Joseph, 195 

Fredholm alternative 

for differential operators, 147 

for integral equations, 342 

for systems of linear equations, 381 

Fredholm series, 362 

Fredholm determinant, 361 

Friedel sum rule, 146 

Fréchet derivative, see functional derivative 

function space, 52 

normed, 53 
functional 

definition, 2 

derivative, 3 

local, 2 

Gauss quadrature, 82 
Gaussian elimination, 349 
Gelfand-Dikii equation, 176 
generalized functions, see distributions 
generating function 

for Legendre polynomials, 290 

for Bessel functions, 298 
Gibbs' phenomenon, 411 
gradient 

in curvilinear co-ordinates, 282 
Gram-Schmidt procedure, 65, 288 
Green function 
analyticity of causal, 163 
causal, 154, 196, 208 
construction of, 149 
modified, 156, 163 
symmetry of, 158 
group velocity, 247 

half-range Fourier series, 405 
hanging chain, see catenary 
Hankel function, 297 

spherical, 313 
harmonic oscillator, 129 
Haydock recursion, 85 
Heaviside function, 89 
Helmholtz decomposition, 214 
Helmholtz equation, 298 
Helmholtz-Hodge decomposition, 237 
Hermitian 

differential operator, see formally 
self-adjoint operator 

matrix, 110, 388, 389 
Hermitian conjugate 

matrix, 376 

operator, see adjoint 
heterojunction, 122 
Hilbert space, 57 

rigged, 77 
Hilbert transform, 414, 415 
hydraulic jump, 266 

identity 

delta function as, 70 

matrix, 369 
image space, 369 
images, method of, 230 
index 

of operator, 381 
indicial equation, 103 



inequality 

Cauchy-Schwarz-Bunyakovsky, 57 

triangle, 55, 56, 58 
inertial wave, 277 
infimum, 53 
integral equation 

Fredholm, 331 

Volterra, 331 
integral kernel, 70 

Jordan form, 391 

Kelvin wedge, 251 
kernel, 369 

Kirchhoff approximation, 233 
Korteweg de Vries (KdV) equation, 
108 

Lagrange interpolation formula, 82 
Lagrange multiplier, 36 

as eigenvalue, 37 
Lagrange's identity, 111 
Lagrange, Joseph-Louis, 11 
Lagrangian, 11 

density, 19 
Lanczos algorithm, 85 
Laplace development, 385 
Laplacian 

acting on vector field, 238, 285 

in curvilinear co-ordinates, 285 
Laplacian acting on vector field, 237 
Lax pair, 108, 274 
least upper bound, see supremum 
Levinson's theorem, 146 
limit-circle case, 316 
linear dependence, 366 
linear map, 368 
linear operator 

bounded, 353 
closable, 356 

closed, 356 

compact, 353 

Fredholm, 353 

Hilbert-Schmidt, 354 
Liouville measure, 37 
Liouville's theorem, 96 
Lobachevski geometry, 42, 45 
LU decomposition, see Gaussian elimination 

Maupertuis, Pierre de, 11 
Maxwell's equations, 24 
Maxwell, James Clerk, 24 
measure, Liouville, 37 
Mehler's formula, 68 
metric tensor, 372 
minimal polynomial equation, 388 
monodromy, 103 

multipliers, undetermined, see Lagrange 
multipliers 

Neumann function, 297 
Neumann series, 360 
Noether's theorem, 18 
Noether, Emmy, 15 
non-degenerate, 372 
norm 

$L^p$, 55 

definition of, 55 

sup, 55 
nullspace, see kernel 
Nyquist criterion, 164 

observables, 119 
optical fibre, 309 
orthogonal 

complement, 377 
orthonormal set, 58 



Pöschel-Teller equation, 139 
pairing, 75, 371 
Parseval's theorem, 64, 407 
particular integral, 99, 160 
Peierls, Rudolph, 134, 313 
phase shift, 135, 319 
phase space, 37 
phase velocity, 247 
Plateau's problem, 1, 4 
Plemelj, Josip, 168 
Poincaré 

disc, 45 
point 

ordinary, 102 
regular singular, 102 
singular, 102 
singular endpoint, 316 
Poisson 

kernel, 417 
summation, 416 
Poisson kernel, 221 
polynomials 

Hermite, 67, 130 
Legendre, 66, 287, 355 
orthogonal, 65 
Tchebychef, 68, 343 
pressure, 28 

principal part integral, 80, 168, 343 
principle of least action, see action 
principle 

product 

inner, 371 
of matrices, 369 
pseudo-momentum, 23 
pseudo-potential, 322 



quadratic form, 393 
diagonalizing, 393 
quotient 

of vector spaces, 377 

range, see image space 
range-nullspace theorem, 369 
rank 

column, 369 

of matrix, 369 
Rayleigh-Ritz method, 127 
realm, sublunary, 133 
recurrence relation 

for Bessel functions, 299, 315 

for orthogonal polynomials, 65 
resolvent, 170 
resonance, 137, 322 
Riemann-Hilbert problem, 349 
Riemann-Lebesgue lemma, 408 
Riesz-Fréchet theorem, 76, 81 
Rodriguez' formula, 66, 288 
Routhian, 15 

scalar product, see product, inner 
scattering length, 319 
Schwartz space, 75 
Schwartz, Laurent, 74 
Schwarzian derivative, 105 
Scott Russell, John, 271 
Seeley coefficients, 177 
self-adjoint 

matrix, 388 
self-adjoint operator 

formally, 112 

truly, 118 
seminorm, 75 
sesquilinear, 371 
singular endpoint, 316 
singular integral equations, 343 
skew-symmetric form, see symplectic form 



soap film, 4 
soliton, 108, 270 
space 

$L^p$, 55 

Banach, 57 

Hilbert, 57 

of C n functions, 52 

of test functions, 74 
spanning set, 367 
spectrum 

continuous, 125, 317 

discrete, 125 

point, see discrete spectrum 
spherical harmonics, 293 
string 

sliding, 30 

vibrating, 20 
Sturm-Liouville operator, 39, 40, 72, 
112 

supremum, 53 

symmetric differential operator, see formally self-adjoint operator 
symplectic form, 396 

Taylor column, 277 
tensor 

elastic constants, 44 

energy-momentum, 22 

metric, 372 

momentum flux, 28 

strain, 44 

stress, 44 
test function, 74 
theorem 

addition 
for spherical harmonics, 295 

Cayley's, 385 

Green's, 228 
mean value for harmonic functions, 221 

range-nullspace, 369 

Riesz-Fréchet, 76, 81 

Weierstrass approximation, 66 

Weyl's, 316 
tidal bore, see hydraulic jump 
transform 

Fourier, 332, 406 

Fourier-Bessel, see Hankel 

Hankel, 307 

Hilbert, 414, 415 

Laplace, 332, 334 

Legendre, 15, 29 

Mellin, 332 

Mellin sine, 222 

Radon, 337 



approximation theorem, 66 
Weyl's 

identity, 399 

theorem, 316 
Weyl, Hermann, 134, 316 
Wiener-Hopf 

integral equations, 347 
Wronskian, 94 

and linear dependence, 96 

in Green function, 150 



variational principle, 127 
vector 

Laplacian, 237, 238, 285 
vector space 

definition, 365 
velocity potential, 27 

as Lagrange multiplier, 39 
vorticity, 28 



wake, 250 

ship, 251 
wave 

drag, 250 

equation, 190 

momentum, see pseudo-momentum 
non-linear, 259 
shock, 263 
surface, 33, 243 
transverse, 21 
Weierstrass 



